Abstract

We consider community detection in Degree-Corrected Stochastic Block Models (DC-SBM).
We propose a spectral clustering algorithm based on a suitably normalized adjacency
matrix. We show that this algorithm consistently recovers the block-membership
of all but a vanishing fraction of nodes, in the regime where the lowest degree is of order
log(n) or higher. Recovery succeeds even for very heterogeneous degree-distributions.
The algorithm does not rely on parameters as input. In particular, it does not need
to know the number of communities.


1 Introduction 

Social and information networks are omnipresent in our daily lives and have been the 
interest of much recent research activity [25]. Studies have been focusing on local 
properties of network systems as well as their large-scale properties. Among those 
large-scale phenomena, community structure received a lot of attention. A wide variety 
of networks are found to have communities or blocks: groups of vertices with many links 
between themselves and substantially fewer to the rest of the network, or vice-versa. 
One of the fundamental problems in network inference considers the detection of such 
communities. See [24] and [11] and references therein for an overview.

In the present manuscript we consider an instance of a certain probabilistic model 
that might be fit on the observed data. One of the best known such models is the 
stochastic block model (SBM) [13]. In its simplest form, each of n vertices belongs 
to precisely one of K communities. Edges are independently drawn between different 
nodes with probabilities only depending on the block memberships of the involved 
vertices. This model is able to generate a diverse collection of random graphs, while it 
stays analytically tractable. 

In practice however, the SBM fails to accurately describe observed data: due to the
stochastic indistinguishability of nodes in the same community, it does not allow for degree
heterogeneity within blocks. The DC-SBM was proposed in [16] to overcome this issue.
In addition to a block-structure, the DC-SBM allows the fitting of arbitrary degree
sequences such that the expected degree of a vertex is independent of its community.
The present paper deals with community detection in this model.

Several methods for community detection can be found in the literature. They 
include, but are not limited to, modularity maximization [26], belief propagation [9] 
and spectral clustering. For the latter, see for instance [30] and the section of related 
work in the underlying manuscript. 

Spectral algorithms employ eigenvectors of matrices representing network data to
return non-local information of the network. The most commonly used matrices are
the adjacency matrix and the (un)normalized graph Laplacian [30]. In [32] the authors
study the spectra of the adjacency matrix for networks possessing arbitrary degree 
distributions while simultaneously exhibiting a community-structure. They demonstrate 
that those spectra consist in general of two components: a part containing the bulk of 
eigenvalues and a separated part with outliers whose number is in general equal to the 
number of blocks present. 

The contribution of our paper is as follows: We demonstrate with a clean analysis 
that community detection in a moderately sparse DC-SBM is feasible under rather 
general conditions on the degree-sequence. 

More specifically, we consider the matrix $H$ with entry $(u, v)$ given by
$H_{uv} = \frac{1}{\hat D_u \hat D_v}$ if $A_{uv} = 1$ and $H_{uv} = 0$ otherwise (here $A$ is the
adjacency matrix of the graph and $\hat D_u$ is the observed degree of vertex $u$), which we
shall call the normalized adjacency matrix.
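
To make this definition concrete, the following minimal sketch builds the normalized adjacency matrix from a 0/1 adjacency matrix. The function name `normalized_adjacency` and the list-of-lists representation are illustrative assumptions, not part of the paper.

```python
def normalized_adjacency(A):
    """Return H with H[u][v] = 1 / (deg(u) * deg(v)) if A[u][v] == 1, else 0.

    A is a symmetric 0/1 adjacency matrix given as a list of lists
    (an assumed representation, chosen only for this illustration).
    """
    n = len(A)
    deg = [sum(row) for row in A]  # observed degrees
    H = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if A[u][v] == 1:
                H[u][v] = 1.0 / (deg[u] * deg[v])
    return H


# Path graph 0 - 1 - 2, with observed degrees (1, 2, 1).
A_path = [[0, 1, 0],
          [1, 0, 1],
          [0, 1, 0]]
H_path = normalized_adjacency(A_path)
```

In this toy graph each edge joins a vertex of degree 1 to the vertex of degree 2, so both non-zero entries above the diagonal equal 1/2.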

We show that this matrix concentrates around a deterministic matrix $P$ of rank
$L \le K$, when the minimum expected degree is as small as log(n). To establish this
concentration-result, we use Lemma 6.2 below, which could be of independent interest, 
as a simple alternative to the commonly used Davis-Kahan theorem. 

Due to the underlying community structure, the matrix that has the first L eigenvectors
of P as its columns has the nice property that it has only K different rows. Hence,
due to this fact and the concentration of H around P, the rows of the corresponding
eigenvector matrix of H, considered as points in an L-dimensional Euclidean space, must
cluster around K centres. This property indicates that H is the right matrix to analyse
when dealing with the DC-SBM. Indeed, associating each vertex with its corresponding 
row, we show in this paper that we retrieve the correct community of all but a vanishing 
fraction of nodes. 

Further, we point out a natural connection between H and a random walk on the 
observed graph. 

The organization of this article is as follows. First we formally introduce the DC-SBM
together with the necessary notation. Next we state our main result for community
detection in this model. Section 4 then discusses the relevant literature, performance
on real data, the conditions in the main theorem and a connection
between H and random walks. Section 5 outlines the approach we take to prove the
main theorem, together with statements of all auxiliary lemmas. Section 6
contains algebraic preliminaries. All proofs are deferred to Section 7. In the last section
we give a suggestion for future research.


2 Model and notations 


The Degree-Corrected Stochastic Block Model, denoted by
$\mathcal{G}(B, K, \{\sigma_u\}_{u=1}^n, \{D_u\}_{u=1}^n)$, is a generalization of the classical
Erdős-Rényi model of random graphs. We introduce a random graph on $V = \{1, \ldots, n\}$.
We partition the set of vertices into $K$ communities of $\alpha_k n$ members each: each
vertex $u$ is given a label $\sigma_u \in S := \{1, \ldots, K\}$. A weight $D_u$ is given to
each vertex $u$ to encode its expected degree. Without loss of generality
we assume that $D_1 \le D_2 \le \cdots \le D_n$. All weights and labels will depend on $n$, but this
is suppressed in the notation here. For each pair $(u, v)$, we include the edge $(u, v)$ with
probability

$$\mathbb{P}(u \sim v) = \begin{cases} \dfrac{D_u B_{\sigma_u \sigma_v} D_v}{n \bar D} & \text{if } u \neq v, \\ 0 & \text{if } u = v, \end{cases} \tag{2.1}$$

where $B \in \mathbb{R}_+^{K \times K}$ is a symmetric matrix, independent of $n$, and
$\bar D = \frac{1}{n} \sum_{u=1}^n D_u$ is the average weight. $B$ may be chosen completely
independent of the weights $\{D_u\}_{u=1}^n$; all information about the community-structure
is then captured by $B$ alone.
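
As an illustration of the edge probabilities in (2.1), the sketch below draws one graph from a DC-SBM. It assumes 0-based community labels and a list-of-lists block matrix; the function name `sample_dcsbm`, the `seed` parameter and the example values are hypothetical choices made for this illustration only.

```python
import random


def sample_dcsbm(B, sigma, D, seed=0):
    """Draw one adjacency matrix using the edge probabilities of (2.1).

    B     : K x K symmetric block matrix (list of lists)
    sigma : community labels in {0, ..., K-1} (0-based for convenience)
    D     : vertex weights D_u
    seed  : seed of the pseudo-random generator (illustrative parameter)
    """
    rng = random.Random(seed)
    n = len(D)
    d_bar = sum(D) / n  # average weight
    A = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):  # u != v, so the diagonal stays 0
            p = D[u] * B[sigma[u]][sigma[v]] * D[v] / (n * d_bar)
            if p > 1.0:
                raise ValueError("edge probability exceeds 1")
            if rng.random() < p:
                A[u][v] = A[v][u] = 1
    return A


# Two communities of four vertices each, all weights equal to 2.
B_example = [[1.5, 0.5], [0.5, 1.5]]
sigma_example = [0, 0, 0, 0, 1, 1, 1, 1]
D_example = [2.0] * 8
A_example = sample_dcsbm(B_example, sigma_example, D_example, seed=1)
```

The explicit check on `p` reflects the requirement that the weights and $B$ must be chosen so that (2.1) defines a probability.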

We make some further assumptions on the parameters of the model. For (2.1) to
define a probability, we assume

$$\frac{D_u B_{\sigma_u \sigma_v} D_v}{n \bar D} \le 1, \tag{2.2}$$

for all $u, v$.

The vector $\alpha = (\alpha_1, \ldots, \alpha_K)$ is assumed to be constant. Hence, the clusters are well
balanced, as the size of each community grows linearly with $n$. Further, the average
weight in a cluster,

$$\bar D_i = \frac{1}{\alpha_i n} \sum_{u : \sigma_u = i} D_u,$$

is assumed to be asymptotically a fraction of the average weight $\bar D$. That is, we assume
that there exist non-zero constants $d_1, \ldots, d_K$ such that

$$\lim_{n \to \infty} \frac{\bar D_i}{\bar D} = d_i, \tag{2.3}$$

for all $i$. Under this assumption, the following limit exists for all $i$,

$$M_i = \lim_{n \to \infty} \frac{\bar M_i}{\sum_{l=1}^n D_l} = \sum_{k=1}^K B_{ik} \alpha_k d_k, \tag{2.4}$$

where $\bar M_i = \sum_{u=1}^n D_u B_{i \sigma_u}$.

We shall see in Section 4.8 that we need the following condition for the communities
to be identifiable: we assume that for all $i \neq l$ there exists $j$ such that

$$\frac{B_{ij}}{M_i} \neq \frac{B_{lj}}{M_l}. \tag{2.5}$$


In the analysis that follows, we will consider the random graph in a moderately-sparse
regime, that is, we assume either

$$\lim_{n \to \infty} \frac{D_1}{\log(n)} = \infty, \tag{2.6}$$


or, for some constant $c < 1/2$,

$D_1 \ge C_{B,M} \cdot \log(n)$ and lim r>2 / Jl^ —>■ 0, (2.7)

where $C_{B,M}$ is some constant depending on $B$, $M = (M_1, \ldots, M_K)$ and the convergence
rate in (2.4). Further, we assume the following condition on the weights:

^ = $\Omega(\log(n))$. (2.8)

Note that under those assumptions, $D_u$ represents the expected degree of vertex $u$ up to
a multiplicative factor that depends only on the community $\sigma_u$. Indeed, if $\hat D_u$ denotes
the observed degree of vertex $u$, then


l^u 

= ^ (M,„ - 

nD 

— Du M T ^n). 


(2.9) 
where e„ < ^ ^ ^ = o„(l).


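As an example, we let $\{\sigma_u\}_{u=1}^n$ be any sequence such that $n/2$ of its elements are
1 and the other $n/2$ elements are 2. Then, there are two equally-sized communities:
$K = 2$ and $\alpha_1 = \alpha_2 = 1/2$. Let $\{D_u\}_{u=1}^n$ be any non-decreasing sequence with $D_1 > 0$.
Put

$$B = \begin{pmatrix} a & b \\ b & a \end{pmatrix}$$

for some constants $a$ and $b$. Then

$$\mathbb{P}(u \sim v) = \begin{cases} a \, \dfrac{D_u D_v}{n \bar D} & \text{if } \sigma_u = \sigma_v, \\ b \, \dfrac{D_u D_v}{n \bar D} & \text{otherwise}. \end{cases}$$

This is exactly the extended planted partition model (EPPM) considered in [4].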


3 Main Results 


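Our aim is to retrieve the underlying community structure from a single observation
of the random graph. We do this by analysing the spectral properties of
$H \in \mathbb{R}^{n \times n}$, defined for $u, v \in V$ by

$$H_{uv} = \begin{cases} \dfrac{1}{\hat D_u \hat D_v} & \text{if } A_{uv} = 1, \\ 0 & \text{otherwise}, \end{cases} \tag{3.1}$$

where $A$ is the adjacency matrix of the observed graph and $\hat D_u$ is the observed degree
of vertex $u$. We shall demonstrate that this
matrix is close (in a sense to be specified below) to the matrix $P$ defined for $(u, v)$ as

$$P_{uv} = \frac{1}{n \bar D} \, \frac{B_{\sigma_u \sigma_v}}{M_{\sigma_u} M_{\sigma_v}}. \tag{3.2}$$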


Denote the rank of $P$ by $L$. Due to the community structure, $L \le K$ (see below for
details).

In the regime where (2.6) holds, let $f$ be any function tending to zero, such that


fin) » i + 
For the regime where (2.7) holds, let / 
fin) > i + 


1 l ^osjn) 

yiogH V 

be tending to zero in such a way that 

1 1 

\/log(n) log^/^(n) 


Further, let r(n) = 1/f{n)^^^. 

Algorithm 1 uses H to reconstruct the communities. 

Algorithm 1

(a) Calculate the average degree in the graph, call it $D_{\text{average}}$. Let $\hat L$ be the number of
eigenvalues of $H$ that are in absolute value larger than $f(n)/D_{\text{average}}$.

(b) Compute the first $\hat L$ orthonormal eigenvectors of $H$, ordered according to their
absolute eigenvalues. Denote these eigenvectors and their corresponding eigenvalues by
$x_1, \ldots, x_{\hat L}$ and $\lambda_1, \ldots, \lambda_{\hat L}$ respectively.

(c) Associate to each node $u \in V$ the vector

$$z_u = (x_1(u), \ldots, x_{\hat L}(u)). \tag{3.3}$$

Cluster the vectors as follows: Pick T(n) pairs of vertices, label them 

{u{l),u'{u{T{n)),u'{T{n))). Calculate S{t) = v^||z„(() and e = 

min(.^('j)>^2/3(„) (5(f). Find a vertex m so that {u' : ^/n\\zm — 5«'|| < e/8} has 
cardinality larger than f^^^{n) n. Form a community consisting of all nodes in 
{u' : ^/n\\zJn — Zu'W < e/4}. Remove those nodes and iterate this procedure. 
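
The eigenvalue thresholding and the spectral embedding of steps (a) to (c) can be sketched as follows. This is only an illustration under stated assumptions: the threshold function value is passed in as a plain number `f_n` (its admissible form depends on the regime and is not reproduced here), the pair-sampling clustering of step (c) is omitted, and the small Jacobi routine `jacobi_eigh` is a stand-in for any symmetric eigen-solver. All function names are hypothetical.

```python
import math


def jacobi_eigh(S, max_sweeps=100, tol=1e-22):
    """Eigenvalues and eigenvectors (columns of V) of a small symmetric
    matrix, computed with cyclic Jacobi rotations."""
    n = len(S)
    a = [[float(x) for x in row] for row in S]
    V = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off <= tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                tau = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, tau) / (abs(tau) + math.sqrt(1.0 + tau * tau))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = t * c
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # accumulate the eigenvectors
                    vkp, vkq = V[k][p], V[k][q]
                    V[k][p], V[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], V


def spectral_embedding(A, f_n):
    """Steps (a)-(c): estimate L_hat and return the row vectors z_u."""
    n = len(A)
    deg = [sum(row) for row in A]
    d_average = sum(deg) / n
    H = [[1.0 / (deg[u] * deg[v]) if A[u][v] == 1 else 0.0 for v in range(n)]
         for u in range(n)]
    eigenvalues, V = jacobi_eigh(H)
    # (a) count the eigenvalues above the threshold f(n) / D_average
    L_hat = sum(1 for lam in eigenvalues if abs(lam) > f_n / d_average)
    # (b) order eigenvectors by decreasing absolute eigenvalue
    order = sorted(range(n), key=lambda i: abs(eigenvalues[i]), reverse=True)
    # (c) z_u collects the u-th coordinates of the first L_hat eigenvectors
    z = [[V[u][j] for j in order[:L_hat]] for u in range(n)]
    return L_hat, z


def two_cliques(m):
    """Toy graph: two disjoint complete graphs on m vertices each."""
    n = 2 * m
    return [[1 if u != v and (u < m) == (v < m) else 0 for v in range(n)]
            for u in range(n)]


A_toy = two_cliques(4)
L_hat, z = spectral_embedding(A_toy, f_n=0.5)
```

On this toy graph the rows $z_u$ coincide within each clique and differ between the two cliques, which is the clustering property that the remainder of step (c) exploits.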


We have: 

Theorem 3.1. Consider a DC-SBM $\mathcal{G}(B, K, \{\sigma_u\}_{u=1}^n, \{D_u\}_{u=1}^n)$. Assume that
(2.2), (2.3), (2.5), (2.8) and either (2.6) or (2.7) hold. Then, Algorithm 1 retrieves
the community of all but a vanishing fraction of nodes.

The first step estimates $L$. Indeed, by definition there are only $L$ non-zero eigenvalues
of $P$. Those are all of order $1/\bar D$ and the corresponding first eigenvalues of $H$ are of
the same order. The remaining eigenvalues of $H$ are negligible with respect to $f(n)/\bar D$.

Under the assumptions in Theorem 3.1, all but a negligible number of rows of
the matrix having the first $L$ eigenvectors of $H$ as its columns cluster for large $n$ to
within negligible distance of block-specific representatives that are separated by some 
non-vanishing gap (call the corresponding vertices typical). This is exploited in the 
third step. There, with high probability, all picked vertices are typical. Thus, for 
a pair t, S{t) vanishes in front of f^^^(n) if the vertices in the pair belong to the 
same community. Hence, by calculating the distance between the other vertices, we 
obtain e as an estimator for the gap mentioned above. At most f{n)^^^ n vertices are 
not typical. Hence, the chosen ball around m with radius e/8 contains a negligible 
number of non-typical vertices, the remaining vertices should necessarily be in the same 
community. By enlarging the radius of the ball around m, we include all vertices of a 
single community. See the proof of Theorem 3.1 below for more details. 

Remark 3.2. Note that the only input to the algorithm is the regime (i.e., either
$D_1(n) = \Theta(\log(n))$ or $D_1(n) \gg \log(n)$). This information is used to pick the right
form of the function $f$. Alternatively, we could adapt the algorithm so that it requires
$L = \mathrm{Rank}(B)$ and $\alpha_{\min}$ instead of the regime: Step 1 can then be skipped, in Step 2 we
replace $\hat L$ by $L$, and in Step 3 we choose a vertex $m$ that contains in its $\epsilon/8$ neighbourhood
at least $\alpha_{\min} n / 2$ vertices.


4 Discussion 

Before we prove the main theorem, we make some observations and remarks. 

4.1 Adjacency matrix 

In [19] and [21], the authors use the adjacency matrix A of a graph to recover the
underlying community-structure. They consider the matrix having the first K eigenvectors
of A as its columns and show that, under appropriate conditions, its rows cluster
in K different directions. However, results in [5] and [22] suggest that the algorithms
in [19] and [21] fail when the expected degree sequence is too irregular. Intuitively, if
the prescribed degree sequence follows a power-law, then so does the spectrum of the 
adjacency matrix. Further, as we shall demonstrate below, the first K eigenvectors 
correspond only to the K top-degree nodes, and should therefore not be expected to 
capture more global features of a graph, such as its underlying block-structure. The 
following theorem makes this observation more rigorous: 

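Theorem 4.1. Consider a DC-SBM $\mathcal{G}(B, K, \{\sigma_u\}_{u=1}^n, \{D_u\}_{u=1}^n)$ such that

$$D_u = \begin{cases} D_1 & \text{if } 1 \le u \le n - k, \\ D_1 n^{\gamma} \, (u + 1 - (n - k)) & \text{if } u > n - k, \end{cases}$$

where k = and the constants $\beta$ and $\gamma$ obey: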


(4.1) 


(4.2) 


and 

$\gamma > 4\beta$.


(4.3) 


Further, assume that 


Under these conditions, the first $k$ eigenvectors become for large $n$ indistinguishable
from the eigenvectors of a graph that is the disjoint union of $k$ stars having degrees
$D_n + O(1), \ldots, D_{n-k} + O(1)$.

For instance, Di{n) = /3 = 1/20 and 7 = 1/5, meets the assumptions in 

Theorem 4.1. Further, it verifies the conditions in the main theorem (Theorem 3.1): 
Algorithm 1 will successfully return the community membership of all but a vanishing 
fraction of nodes. 

We remark that the above theorem is inspired by the main result in [22]. There,
random graphs without community structure are considered and the power-law behaviour
of the corresponding spectrum is obtained. To say something about the eigenvectors,
we additionally introduce a gap between the degrees of the top $k$ nodes and those of the
remaining $n - k$ nodes. This allows us to use Lemma 6.2; see the proof of Theorem 4.1 below.


ifu<% 

ifu>^. 


(4.4) 


4.2 SCORE 

Interestingly, the first eigenvectors of A do contain information about the underlying 
community structure, but in a hidden way. Indeed, the SCORE method proposed in 
[15] shows that, under some conditions, using the coordinate-wise ratios of the leading 
eigenvectors leads to consistent clustering. 

Note that we obtain the same random graph model as in [15] by putting
$\theta(u) = D_u / \sqrt{n \bar D a}$ and $P(i, j) = a B_{ij}$, where $a^{-1} = \max_{i,j} B_{ij}$. We further
note that the conditions in [15] are more stringent: condition (2.7) there demands that
$P$ (or $B$) is non-singular, which is unnecessary here, see Remark 4.3 below.

4.3 Laplacian 

As we just pointed out, the adjacency matrix does not accurately capture global
properties of a graph. The normalized Laplacian is a more suitable candidate. It is
defined by $L = I - D^{-1/2} A D^{-1/2}$, where $I$ is the identity matrix, $A$ is the adjacency
matrix of the graph and $D$ the diagonal matrix with the row sums of $A$ on its diagonal (i.e., the
degrees). The object of study in [5] is the Laplacian spectrum of random graphs with a given
degree sequence $(d_1, \ldots, d_n)$, where edges are independently present between each pair of
vertices $(u, v)$ with probability $d_u d_v / \sum_l d_l$. In the regime $d_1^2 \gg \bar d$, with
$\bar d = \frac{1}{n} \sum_j d_j$, the eigenvalues satisfy the semicircle law with respect to the
circle of radius $2/\sqrt{\bar d}$ centred at 1.

Denote the eigenvalues of the normalized Laplacian by $0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n \le 2$.
It is a well-known fact that all eigenvalues are located in the interval [0, 2] and that
the algebraic multiplicity of 0 equals the number of components in the graph. The
authors of [5] further study the spectral gap $\Delta = \min\{\lambda_2, 2 - \lambda_n\}$, which reflects global
properties of the random graph. According to [5], when $d_1 \gg \log^2(n)$,

A > 1 - ^ + 

~ 4/%/hJ di 

thus in this dense regime, all non-zero eigenvalues are close to 1 and thus the spectrum 
of the Laplacian contains no outliers, in contrast with the adjacency matrix. This 
bound is improved in [6], to 


for di » log(n). 

The stochastic block model is a special case of the latent space model [12]. In
this model a vector $z_u$ is associated to each node $u$ and an edge between $u$ and $v$ is
present with probability depending only on $z_u$ and $z_v$. If $A$ is the adjacency matrix
of the graph, $D$ the diagonal matrix containing the degrees and $L = D^{-1/2} A D^{-1/2}$,
then the population versions of these matrices are defined as
$\mathcal{A} = \mathbb{E}[A \mid z_1, \ldots, z_n]$, $\mathcal{D} = \mathrm{diag}(\mathcal{A} \mathbf{1}_n)$ and
$\mathcal{L} = \mathcal{D}^{-1/2} \mathcal{A} \mathcal{D}^{-1/2}$. In [28] convergence of
the empirical eigenvectors of $L$ to the population eigenvectors of $\mathcal{L}$ is shown. This
follows from their novel result establishing the convergence of $L^2$ to $\mathcal{L}^2$ in Frobenius
norm. This forms the basis of an algorithm that uses the first $k$ eigenvectors (with
the eigenvalues ordered decreasingly with respect to their absolute value) to recover
the hidden communities in the SBM (thus, without degree-corrections). The algorithm
is shown to succeed if those first $k$ eigenvalues are sufficiently separated from the rest
of the eigenvalues and if the minimum expected degree exceeds $n/\sqrt{\log n}$, which is more
restrictive than the lower bound of $\log n$.

In [8] the matrix $\mathbb{E}[D]^{-1/2} A \, \mathbb{E}[D]^{-1/2}$ (reminiscent of the normalized Laplacian) is
used to retrieve the underlying community structure in the DC-SBM. Note that this
method requires the expected degrees to be known. It succeeds if the minimum degree
is of order $\log^6 n$.

To deal with low-degree nodes, the authors in [4] use the degree-corrected random
walk Laplacian $I - (D + \tau I)^{-1} A$, where $\tau > 0$ is a constant, to find clusters in the
extended planted partition model (EPPM) where the expected minimum degree is
$\Omega(\log n)$. In the EPPM, $B$ is a matrix where an element equals $p$ if it is on the diagonal,
and $q$ otherwise; it is thus a special case of the DC-SBM. The algorithm based on the
random walk Laplacian requires $\tau$ as input and the optimal value of $\tau$ depends in a
complex way on the degree-distribution of the graph. The main theorem in [4] comes
with lengthy conditions that are not easy to compare with other results. This theorem 
restricted to the setting where ah dAs equal d, assumes g to be a constant, which is 
more restrictive than our assumptions. It is unclear whether the results for the EPPM 
can be neatly generalized using the same operator to the DC-SBM, given the complexity 
of the present conditions. 

Although the Laplacian captures global properties of a graph much better than the
adjacency matrix, its spectrum is still influenced by the underlying degree-structure.
Indeed, consider a DC-SBM with 3000 vertices divided in $K = 3$ equally-sized communities,
degree-sequence


( 

I (u- 1000)^''^ 

[ (u- 2000)^''^ 


if M = 1 ,..., 1000 
ifu= 1001,... ,2000 
ifu = 2001,..., 3000, 


and community-membership 


— 



if u = 1 ,..., 1000 
if u= 1001,..., 2000 
ifu = 2001,..., 3000. 


(4.5) 


(4.6) 


In Figure 1, we have plotted the eigenvectors corresponding to the first and second largest
absolute eigenvalue of $I - \mathbb{E}[D]^{-1/2} \mathbb{E}[A] \mathbb{E}[D]^{-1/2}$, where $A$ is the adjacency matrix
and $D$ is the diagonal matrix containing the row sums of $A$. The Laplacian concentrates
around $I - \mathbb{E}[D]^{-1/2} \mathbb{E}[A] \mathbb{E}[D]^{-1/2}$ if the minimum degree is large enough (see Section 8).
The community structure is clearly perturbed by the degree-sequence. In general,
an additional step is needed to determine the community-membership of all nodes when
using the Laplacian.

Compare this figure to Figure 2, containing the first two eigenvectors of
$\mathbb{E}[D]^{-1} \mathbb{E}[A] \mathbb{E}[D]^{-1}$. The vertices are seen to be clearly divided into three communities.



[Figures 1 and 2 (plots). Horizontal axis: value of elements of first eigenvector.]




Figure 1: Plot of the eigenvectors corresponding to the first and second largest absolute
eigenvalue of $I - \mathbb{E}[D]^{-1/2} \mathbb{E}[A] \mathbb{E}[D]^{-1/2}$, where $A$ is the adjacency matrix of a random graph
drawn according to the DC-SBM defined by (4.5) and (4.6), and $D$ is the diagonal matrix
containing the row sums of $A$. For those eigenvectors, say $(x_1, \ldots, x_n)'$
and $(y_1, \ldots, y_n)'$, we draw a dot $(x_u, y_u)$ for each element $u$.


Figure 2: Plot of the eigenvectors corresponding to the first and second largest absolute
eigenvalue of $\mathbb{E}[D]^{-1} \mathbb{E}[A] \mathbb{E}[D]^{-1}$, where $A$ is the adjacency matrix of a random graph
drawn according to the DC-SBM defined by (4.5) and (4.6), and $D$ is the diagonal matrix
containing the row sums of $A$. For those eigenvectors, say $(x_1, \ldots, x_n)'$ and
$(y_1, \ldots, y_n)'$, we draw a dot $(x_u, y_u)$ for each element $u$. Note that many elements
are represented by the same dot, clearly reflecting the community structure.




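Now consider another two-community DC-SBM on $n$ vertices with

$$B = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},$$

degree-sequence

$$D_u = \begin{cases} \log^2(n) & \text{if } u \le n/2, \\ 100 \log^2(n) & \text{if } u > n/2, \end{cases} \tag{4.7}$$

and community-membership

$$\sigma_u = \begin{cases} 1 & \text{if } u \le n/2, \\ 2 & \text{if } u > n/2. \end{cases} \tag{4.8}$$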


Then, according to Lemma 6.2, the eigenvectors of $H$ become eventually indistinguishable
from the eigenvectors of the $n \times n$ matrix with zero diagonal and all other
elements equal to $\frac{1}{n \bar D}$. Clearly the communities cannot be recovered from the latter
matrix.

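The off-diagonal elements of $\mathbb{E}[D]^{-1/2} \mathbb{E}[A] \mathbb{E}[D]^{-1/2}$ are asymptotically given by
$\frac{\log^2(n)}{n \bar D} Z_{\sigma_u \sigma_v}$, with $Z = \begin{pmatrix} 1 & 10 \\ 10 & 100 \end{pmatrix}$. Now, $Z$ has
eigenvector $(1, 10)'$ corresponding to eigenvalue 101. The other eigenvalue is zero, so that
the minimal gap between different eigenvalues of $\mathbb{E}[D]^{-1/2} \mathbb{E}[A] \mathbb{E}[D]^{-1/2}$ is
$1 - O(1/n)$. According to [6],
$\rho\left(D^{-1/2} A D^{-1/2} - \mathbb{E}[D]^{-1/2} \mathbb{E}[A] \mathbb{E}[D]^{-1/2}\right) = o(1)$ w.h.p., where $\rho(X)$ denotes the
spectral radius of a matrix $X$. Consequently, Lemma 6.2 entails that for large $n$,
clustering according to the eigenvector of $D^{-1/2} A D^{-1/2}$ corresponding to its largest
eigenvalue reveals the community-membership of all but a vanishing fraction of nodes.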

Those two examples hint that whether the Laplacian L or the degree-normalized
adjacency matrix H should be used depends on the correlation between the degrees
and the communities, and the 'signal-strength' of B. The first example shows that if
the degrees are uncorrelated with the communities, L seems to add some extra noise,
whereas H 'filters' the degrees and immediately reflects the underlying communities.
In the second example, B gives no information about the communities, but the vertices
can be clustered according to their degrees. H ignores this degree-structure and thus
fails to detect the communities. L, in turn, still reflects the degree-sequence and
therefore the communities.

4.4 Regularized spectral clustering 

The paper [27] deals with the shortcomings of the Laplacian by inflating the degrees:
given a number $\tau > 0$, the regularized graph Laplacian [27, 4] is defined as

$$L_\tau = D_\tau^{-1/2} A D_\tau^{-1/2}, \tag{4.9}$$

where $D_\tau = D + \tau I$.

The regularized spectral clustering algorithm in [27] starts with computing the matrix
$X = [X_1, X_2, \ldots, X_K]$, where $X_1, X_2, \ldots, X_K$ are the eigenvectors corresponding to the
$K$ largest eigenvalues. A matrix $X^*$ is then formed by projecting each row of $X$ on the
unit sphere. Considering each row of $X^*$ as a point in $\mathbb{R}^K$, and applying $k$-means with
$K$ centres on these points gives an almost-exact clustering provided some conditions on
$\delta + \tau$ ($\delta$ is the smallest expected degree) and the smallest strictly positive eigenvalue
of $L_\tau$ hold. In particular, condition (a) in Theorem 4.2 of [27] demands that
$\delta + \tau \gtrsim \log(n)$. Since simulation results suggest that $\tau$ should be taken as the
average degree, it is unclear if this method outperforms the algorithm proposed in the
present paper.

We note that [27] is the first work that relates the leverage scores (the Euclidean
norm of the rows of $X$) to the quality of the resulting clustering.
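
The two building blocks of regularized spectral clustering described above, forming the regularized matrix and projecting rows onto the unit sphere, can be sketched as follows. The sketch assumes that the regularized Laplacian has the form $D_\tau^{-1/2} A D_\tau^{-1/2}$ with $D_\tau = D + \tau I$; the eigen-decomposition and the $k$-means step are left out, and the function names are hypothetical.

```python
import math


def regularized_laplacian(A, tau):
    """Return D_tau^{-1/2} A D_tau^{-1/2} with D_tau = D + tau * I
    (assumed form of the regularized graph Laplacian)."""
    n = len(A)
    deg = [sum(row) for row in A]
    scale = [1.0 / math.sqrt(d + tau) for d in deg]
    return [[scale[u] * A[u][v] * scale[v] for v in range(n)] for u in range(n)]


def project_rows_to_sphere(X):
    """Rescale every non-zero row of X to unit Euclidean norm."""
    projected = []
    for row in X:
        norm = math.sqrt(sum(x * x for x in row))
        projected.append([x / norm for x in row] if norm > 0 else list(row))
    return projected


# Path graph 0 - 1 - 2 with degrees (1, 2, 1) and tau = 1.
A_path = [[0, 1, 0],
          [1, 0, 1],
          [0, 1, 0]]
L_tau = regularized_laplacian(A_path, tau=1.0)

# Row projection on a small hand-made matrix (not an eigenvector matrix).
X_star = project_rows_to_sphere([[3.0, 4.0], [0.0, 2.0]])
```

With $\tau = 1$ the inflated degrees of the path graph are (2, 3, 2), so the entry for the edge between vertices 0 and 1 equals $1/\sqrt{6}$.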

4.5 When does the degree-normalized adjacency matrix fail?


Consider a DC-SBM with $K > 2$ communities, such that for two different communities
$i \neq j$, for all $l$, $\frac{B_{il}}{M_i} = \frac{B_{jl}}{M_j}$. Then, it can be verified that, for large $n$, in a
dense enough regime, the eigenvectors of $H$ corresponding to non-zero eigenvalues do not
distinguish between communities $i$ and $j$.

Further, the method breaks down in a too sparse regime. For instance, two low-degree
vertices connected by an edge cause the top eigenvectors to concentrate around
them. We observed this when applying H to the sparse Political Blogs network [2], see
Section 4.7.


4.6 Degree-normalized adjacency matrix 

The same matrix H is used in [7] to perform community detection on the DC-SBM in
the sparse regime (the minimum degree is bounded from below by a constant). The
main restriction in their setting is that the minimum degree must be of the same order
as the average degree; more precisely, there exists $\epsilon > 0$ such that $D_i \ge \epsilon \bar D$ for all $i$.
Hence too much irregularity in the degree sequence is not captured. In this sense our
work complements their results.

Spectral clustering is performed in [7] on a minor of H where the rows and columns
of vertices with a degree smaller than $D_{\text{average}}/\log(n)$ (where $D_{\text{average}}$ is the observed
average degree in the graph) are set to zero, which is not the same as completely leaving
out the nodes with too low a degree. Due to the assumption that all expected
degrees are of the same order, most observed degrees will exceed the lower bound
$D_{\text{average}}/\log(n)$.

There are alternative ways to deal with low-degree nodes, see for instance Section 8
on future research.


4.7 Performance on real networks: Karate Club, Dolphins 
and Political Blogs 

We have tested our method on 3 real networks, namely, Zachary’s karate club [31], 
the dolphin social network [20] and the political blogs dataset [2]. The error rate for 
Zachary’s karate club is 2/34 and for the dolphin social network 0/62. 

The error rate for the political blogs dataset is 230/1221 when thresholding the 
Frobenius eigenvector. We restricted to the giant component of 1221 nodes, as is 
common in most other works (the original data contained 1490 blogs). Our clustering
is worse than that obtained by SCORE (where the error rate is 58/1221), but similar to that of the
non-backtracking matrix (where around 15 percent of the nodes are misclassified [17]).

We observed that the leading eigenvectors are concentrated on a few nodes, due
to the presence of certain problematic structures (such as two low-degree vertices
connected by an edge). However, the value of the Frobenius eigenvector on the remaining
vertices is still correlated with their community-membership as can be observed in 
Figures 3 and 4. 

Figure 3 is a histogram of the Frobenius eigenvector restricted to the roughly 600
nodes that have corresponding value in the interval $[0, 10^{-6}]$. The nodes seem to
concentrate around two centres according to their community. However, this phenomenon
is only weakly visible (note that our theory does not apply to sparse graphs).

In Figure 4 we have sorted the 1221 indices of the Frobenius eigenvector according
to increasing corresponding value: the community structure then becomes clear.

We further observed that thresholding the eighth eigenvector leads to only 160
misclassified vertices. Interestingly, if we inflate the degrees, replacing $H$ by a
matrix $H_{\text{inflated}}$, we obtain an error rate of 74/1221 by thresholding its
second eigenvector. This suggests that the initial misclassifications are indeed due to
low-degree nodes (the average degree is 27, but there are also many leaves present).



[Figures 3 and 4 (plots). Figure 4 panel title: Sorted Frobenius eigenvector; horizontal axis: Rank.]


Figure 3: Histogram of the Frobenius eigenvector restricted to the roughly 600
nodes that have corresponding value in the interval $[0, 10^{-6}]$. The colors represent
the communities.


Figure 4: Ranking of the 1221 indices of the Frobenius eigenvector according to
increasing corresponding value. Rank 1 is the node with smallest value in the
eigenvector and rank 1221 the node with largest value. Colors indicate
community-membership.


4.8 Interpretation of the conditions 

Note that, since $\mathbb{E} \hat D_u$ is related to $D_u$ according to (2.9), $H$ normalizes the tendency of
communities to connect by the average degree of their nodes and therefore loses some
information about the graph. See the observations and remarks below:

Observation 4.2. If, for some $i, j, l \in S$,

$$\frac{B_{ij}}{M_i} = \frac{B_{lj}}{M_l},$$

then

$$\frac{\mathbb{E}[\#\text{ edges between community } i \text{ and } j]}{\mathbb{E}[\text{total degree of vertices in community } i]} \sim \frac{\mathbb{E}[\#\text{ edges between community } l \text{ and } j]}{\mathbb{E}[\text{total degree of vertices in community } l]}.$$

Remark 4.3. The identifiability condition is violated if there are distinct $i$ and $l$ and
there exists some constant $c > 0$ such that

$$B_{ij} = c B_{lj}$$

for all $j$. Indeed, in that case, $M_i = c M_l$ and thus

$$\frac{B_{ij}}{M_i} = \frac{c B_{lj}}{c M_l} = \frac{B_{lj}}{M_l}.$$

However, unlike the setting considered in [19, 15], it is not necessary for $B$ to be full
rank. Indeed, consider


B = 
0

2 
which has rank 2. Let $\alpha_1 = \alpha_2 = \alpha_3 = \frac{1}{3}$ and !)„ = iainlog^(n) for all i = 1,2, 3.

Then it is easily verified that the identifiability condition is met. 

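Note that $\mathcal{G}(B, K, \{\sigma_u\}_{u=1}^n, \{D_u\}_{u=1}^n)$ and
$\mathcal{G}(B^*, K, \{\sigma_u\}_{u=1}^n, \{D^*_u\}_{u=1}^n)$ generate
the same ensemble of random graphs whenever

$$\frac{D_u B_{\sigma_u \sigma_v} D_v}{\bar D} = \frac{D^*_u B^*_{\sigma_u \sigma_v} D^*_v}{\bar D^*}$$

for all $u, v$. Hence, the underlying block-matrix $B$ cannot be estimated from a single
observation of the graph. Rather, we may estimate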

(4.10) 


and, denoting the assigned community-membership (after applying our reconstruction 
algorithm) of Ihy n. 




y\ Hu-u 

— I ^-^v:Tv=j 

= , 1 ) 1 ) 


Bj] 

Mi Mj ■ 


(4.11) 


Hence, for a DC-SBM $\mathcal{G}(B, K, \{\sigma_u\}_{u=1}^n, \{D_u\}_{u=1}^n)$, the matrix

$$\left( \frac{B_{ij}}{M_i M_j} \right)_{i,j}$$

is identifiable, not $B$. It is due to this degeneracy of the DC-SBM and the structure of
$H$ that condition (2.5) in Theorem 3.1 is the best possible:

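Lemma 4.4. Consider a DC-SBM $\mathcal{G}(B, K, \{\sigma_u\}_{u=1}^n, \{D_u\}_{u=1}^n)$. Fix $i$ and $l$; then the
following are equivalent:

(a) for all $j$ we have

$$\frac{B_{ij}}{M_i} = \frac{B_{lj}}{M_l};$$

(b) there exists a DC-SBM $\mathcal{G}(B^*, K, \{\sigma_u\}_{u=1}^n, \{D^*_u\}_{u=1}^n)$, with the same
community-structure $\{\sigma_u\}_u$, such that for all $j$,

$$B^*_{ij} = B^*_{lj},$$

and, for all $u, v$,

$$\frac{D_u B_{\sigma_u \sigma_v} D_v}{\bar D} = \frac{D^*_u B^*_{\sigma_u \sigma_v} D^*_v}{\bar D^*}.$$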


4.9 Random Walk point of view 

The matrix $H$ is related to a random walk on an instance of the random graph. Indeed,

$$H_{uv} = \frac{A_{uv}}{\hat D_u} \frac{A_{vu}}{\hat D_v} = \begin{cases} \dfrac{1}{\hat D_u \hat D_v} & \text{if } A_{uv} = 1, \\ 0 & \text{if } A_{uv} = 0, \end{cases}$$

since $A_{uv} = A_{vu}$ is either 1 (in case edge $uv$ is present) or 0 (when $u$ and $v$ are not
connected). Now, $\hat D_u = \sum_{v=1}^n A_{uv}$ is the observed degree, which we denote here
in increasing order: $\hat D_1 \le \hat D_2 \le \cdots \le \hat D_n$. Thus, $\frac{A_{uv}}{\hat D_u}$ is exactly the probability that a
random walk (in an undirected graph without weights) jumps from vertex $u$ to $v$, given
that it is currently at vertex $u$. Denoting the latter probability by $P_u(u \to v)$, we see
that

$$H_{uv} = P_u(u \to v) \, P_v(v \to u) = P_u(u \to v \to u),$$

due to the Markov property of the random walk. In other words, $H_{uv}$ is the probability
that a random walk currently at vertex $u$ will consecutively traverse edge $uv$ and back.
Extending this observation to powers of $H$ leads to:

$$(H^k)_{uv} = \sum_{l_1, \ldots, l_{k-1} = 1}^n P_u(u \to l_1 \to \cdots \to l_{k-1} \to v) \, P_v(v \to l_{k-1} \to \cdots \to l_1 \to u),$$

the probability that a random walk, after traversing a path of length $k$ starting at $u$
and ending at $v$, subsequently traverses that path in the exact opposite direction.
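
The random-walk identities above can be checked numerically on a small graph. The following sketch verifies the one-step identity and the case $k = 2$ of the path formula; the toy graph (a triangle with one pendant vertex) and all function names are illustrative assumptions.

```python
def transition_matrix(A):
    """Simple random walk: P[u][v] = A[u][v] / deg(u)."""
    n = len(A)
    deg = [sum(row) for row in A]
    return [[A[u][v] / deg[u] for v in range(n)] for u in range(n)]


def normalized_adjacency(A):
    """H[u][v] = 1 / (deg(u) * deg(v)) if A[u][v] == 1, else 0."""
    n = len(A)
    deg = [sum(row) for row in A]
    return [[1.0 / (deg[u] * deg[v]) if A[u][v] == 1 else 0.0 for v in range(n)]
            for u in range(n)]


def matmul(X, Y):
    """Plain matrix product of two square list-of-lists matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
n = len(A)
P = transition_matrix(A)
H = normalized_adjacency(A)

# One step: H_uv = P(u -> v) * P(v -> u).
one_step_ok = all(abs(H[u][v] - P[u][v] * P[v][u]) < 1e-12
                  for u in range(n) for v in range(n))

# Two steps: (H^2)_uv = sum over l of P(u -> l -> v) * P(v -> l -> u).
H2 = matmul(H, H)
two_step = [[sum(P[u][l] * P[l][v] * P[v][l] * P[l][u] for l in range(n))
             for v in range(n)] for u in range(n)]
two_step_ok = all(abs(H2[u][v] - two_step[u][v]) < 1e-12
                  for u in range(n) for v in range(n))
```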
Further, note that

$$(\hat D_1, \ldots, \hat D_n) H = (1, \ldots, 1),$$

hence if $v$ is an eigenvector of $H$ with eigenvalue $\lambda$, then

$$\sum_{u=1}^n v_u = \lambda \sum_{u=1}^n \hat D_u v_u.$$


Since it can be easily verified that $H$ is primitive on connected components, the
Perron-Frobenius theorem implies that the eigenvector $v_{\max}$ corresponding to the largest
eigenvalue $\lambda_{\max}$ (which is positive) has only positive elements. Hence,

$$\lambda_{\max} = \frac{\sum_{u=1}^n v_{\max}(u)}{\sum_{u=1}^n \hat D_u v_{\max}(u)}.$$

We may derive an upper bound by noting that the spectral radius is bounded from
above by the maximal absolute row sum:

$$\lambda_{\max} \le \max_{u = 1, \ldots, n} \sum_{v=1}^n P_u(u \to v \to u).$$


5 Outline of proof of main theorem 


In this section we consider the setting of Theorem 3.1. All lemmas here, except 
Lemma 5.4, assume either (2.6) or (2.7) to hold. Lemma 5.4 assumes condition (2.6): 
the minimum degree should grow faster than log(n). Lemma 5.5 assumes (2.7): the 
minimum degree is of order log(n). 

Our first objective is to show that $\widehat H$ is close to some matrix $P$, in the sense that their difference $W := \widehat H - P$ has negligible spectral radius relative to that of $P$. Here, an entry $(u, v)$ of $P$ is defined as

$$P_{uv} = \frac{1}{n\bar D}\,\frac{B_{\sigma_u\sigma_v}}{M_{\sigma_u} M_{\sigma_v}}. \tag{5.1}$$

We relate $P$ in turn to $Z$ defined by

$$Z_{ij} = \alpha_j\,\frac{B_{ij}}{M_i M_j}. \tag{5.2}$$

Indeed, we show that if $y = (y(1), \ldots, y(K))^T$ is an eigenvector of $Z$ with eigenvalue $\lambda$, then $(y(\sigma_1), \ldots, y(\sigma_n))^T$ fulfils that role for $P$ with eigenvalue $\lambda/\bar D$. As a consequence, the eigenvectors of $P$ associated to non-zero eigenvalues are constant on blocks.

Finally, we consider the matrix that has the first $L$ eigenvectors of $\widehat H$ as its columns. We show that the rows of this matrix cluster to within vanishing distance of block-specific representatives. We start by inspection of the difference

$$W = \widehat H - P = (\widehat H - \bar H) + (\bar H - \mathbb{E}[\bar H]) + (\mathbb{E}[\bar H] - P), \tag{5.3}$$

where $\bar H$ is defined as

$$\bar H_{uv} = \begin{cases} \dfrac{1}{\mathbb{E}\widehat D_u}\dfrac{1}{\mathbb{E}\widehat D_v} & \text{if } A_{uv} = 1, \\ 0 & \text{otherwise}. \end{cases} \tag{5.4}$$


Define

$$\Delta(P) = \min\{|\lambda - \mu| : \lambda \ne \mu,\ \lambda, \mu \text{ eigenvalue of } P\},$$

i.e., the smallest gap between consecutive eigenvalues. A crucial role will be played by Lemma 6.2 below, which says that to any eigenvector $\widehat x$ of $\widehat H$ there exists an eigenvector $\bar x$ of $P$ such that $\|\widehat x - \bar x\| \to 0$ as $n \to \infty$, whenever

$$\frac{\rho(W)}{\Delta(P)} \to 0,$$

as $n \to \infty$, where we recall that $\rho(X)$ denotes the spectral radius of a matrix $X$. Hence, we need to calculate $\Delta(P)$:

Lemma 5.1. The smallest gap between subsequent eigenvalues of $P$ is given by

$$\Delta(P) = \Omega\left(\frac{1}{\bar D}\right).$$


All terms in the right hand side of (5.3) have negligible spectral radius with respect 
to A(P): 

Lemma 5.2. The matrix E [H] is close to P in the following sense: 


Lemma 5.3. The matrix $\bar H$ concentrates with high probability around its expectation, as follows:

$$\rho(\bar H - \mathbb{E}[\bar H]) = O\left(\frac{1}{\sqrt{\log(n)}}\,\frac{1}{\bar D}\right).$$

Lemma 5.4. Consider the DC-SBM in the dense regime, where (2.6) holds. Then, for the spectral radius of the difference $\widehat H - \bar H$ it holds with high probability that

$$\rho(\widehat H - \bar H) = O\left(\sqrt{\frac{\log(n)}{D_1(n)}}\,\frac{1}{\bar D(n)}\right).$$


Lemma 5.5. Consider the DC-SBM in the regime where (2.7) holds. Then, for the spectral radius of the difference $\widehat H - \bar H$ it holds with high probability that

$$\rho(\widehat H - \bar H) = O\left(\frac{1}{\log^{1/3}(n)}\,\frac{1}{\bar D(n)}\right).$$


We use these lemmas in conjunction with Lemma 6.2 below to prove:

Lemma 5.6. To each normed eigenvector $\widehat x$ of $\widehat H$ corresponds a normed eigenvector $\bar x$ of $P$ such that

$$\widehat x \cdot \bar x = 1 - O\left(\frac{\rho^2(W)}{\Delta^2(P)}\right) = 1 - o_n(1),$$

where

$$\rho(W) \le \rho(\widehat H - \bar H) + \rho(\bar H - \mathbb{E}[\bar H]) + \rho(\mathbb{E}[\bar H] - P).$$

Having proved this lemma, we show that Algorithm 1 indeed correctly reconstructs the community of all but a vanishing fraction of vertices.

Recall the definition of $\widehat H$ and observe that $\widehat H$ is symmetric. Consequently, there exist $n$ eigenvectors of $\widehat H$ that form an orthonormal basis: thus, we are indeed able to find $\widehat L$ orthonormal eigenvectors of $\widehat H$ corresponding to its first $\widehat L$ eigenvalues.

Next we show that the $(z_u)_{u \in V}$, defined in (3.3), tend to block-representatives:

Lemma 5.7. There exist $K$ vectors $\{t_k\}_{k \in S}$, i.e., block-representatives, such that


for all but O ^ ^ nodes. 

The remaining and crucial step is to demonstrate that those block-representatives 
are indeed distinct: 

Lemma 5.8. Assume that for all $i, j$ with $i \ne j$ there exists $i'$ such that

$$\frac{B_{ii'}}{M_i M_{i'}} \ne \frac{B_{ji'}}{M_j M_{i'}}, \tag{5.5}$$

then $\|t_k - t_l\| = \Omega(1)$ for all $k \ne l$.


Proof of Theorem 3.1. After proving the above lemmas, it remains to show that $\widehat L$ in step (1) of Algorithm 1 with high probability equals $L$. Further, we should verify that the procedure in step 3 forms the right clusters. For the first step notice the following: in the regime where (2.6) holds,


P^W) -o\^+ ^ 


' ^log(n; 

and in the other regime, where (2.7) holds, 
p{W) = O 


+ 


-b 


^og{n) 

L»i 


Di yiog(n) log^/®(n) 


1 

W 

1 

15' 


Compare this to $f$ as in Algorithm 1: depending on the regime, the term in parentheses goes to zero upon division by $f(n)$. To see this, note that due to Bernstein's inequality (Lemma 6.4), equation (7.10), $\widehat D_u \in \left(\tfrac12 M_{\sigma_u}, \tfrac32 M_{\sigma_u}\right) D_u$ for $u = 1$ and $u = n$ with high probability. Hence $\widehat D_1$ ($\widehat D_n$) is of the same order of magnitude as $D_1$ (respectively $D_n$). Now, due to Lemma 6.2 below, the first $L$ eigenvalues of $\widehat H$ are of order $\Omega(1/\bar D) - O(\rho(W)) \gg f(n)/\bar D$. The remaining eigenvalues are of order $O(\rho(W)) \ll f(n)/\bar D$. Further, $\widehat D_{\mathrm{average}}$ may be written as twice the sum of $O(n^2)$ independent Bernoulli random variables. It is thus with high probability a constant factor away from $\bar D$. Hence $\widehat L = L$ with high probability.

In step 3, the probability that all picked pairs contain only typical vertices (i.e., 
whose corresponding rows cluster around K centres) is larger than (1 — 
which tends to one, since f^^^{n)T{n) —> 0 as n —^ oo. Thus, with high probability, for 
a pair t, S{t) vanishes in front of f^''^{n) if the vertices in the pair belong to the same 
community. S{t) is of order fl(l) otherwise. Hence, e, as defined in step 3 of Algorithm 
1, is of order 11(1), it thus estimates the separation-distance in Lemma 5.8. 

Further, at most f(n)^^^ n vertices are not typical. Hence, the chosen ball around 
m with radius e/8 contains at least (/(n.)^^® — f{n)^^^)n ^ f{n)^^^ n typical vertices. 
Those must necessarily belong to the same community. Since all typical vertices 
belonging to the same community are at most a distance /(n)^'^^ apart, all of them are 
located in the ball of radius e/4 around m. 

We see that the algorithm puts, with high probability, all but a vanishing fraction 
of nodes in K clusters. □

6 Algebraic Preliminaries

We shall make use of the following fact about the spectral radius: 

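Lemma 6.1. If $|X| \le Y$ holds entry-wise for two real symmetric matrices $X$ and $Y$, then

$$\rho(X) \le \rho(Y).$$

Proof. Due to the Rayleigh-Ritz theorem, we have

$$\rho(X) = \max_{\|z\| = 1} \|Xz\|.$$

Hence,

$$\rho(X) = \max_{\|z\|=1} \|Xz\| \le \max_{\|z\|=1} \|\,Y|z|\,\| = \max_{\|z\|=1} \|Yz\| = \rho(Y),$$

where $|z|$ denotes the vector of entry-wise absolute values of $z$.

□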
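A quick numerical sanity check of Lemma 6.1 can be sketched as follows. The two matrices and the helper `spectral_radius` are assumptions made for illustration only; the spectral radius is estimated by power iteration on the squared matrix, which is a standard technique and not the argument used in the proof.

```python
import math

def matvec(M, z):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, z)) for row in M]

def spectral_radius(M, iters=500):
    """Estimate rho(M) for a real symmetric, non-zero M via power iteration on M*M."""
    z = [1.0 + 0.1 * i for i in range(len(M))]  # generic start vector
    for _ in range(iters):
        z = matvec(M, matvec(M, z))
        norm = math.sqrt(sum(t * t for t in z))
        z = [t / norm for t in z]
    # For the unit vector z, ||M z|| approximates the largest |eigenvalue|.
    return math.sqrt(sum(t * t for t in matvec(M, z)))

# Hypothetical example: Y dominates |X| entry-wise.
X = [[0.0, -1.0, 0.5],
     [-1.0, 0.0, 0.2],
     [0.5, 0.2, 0.0]]
Y = [[0.0, 1.0, 0.5],
     [1.0, 0.0, 0.5],
     [0.5, 0.5, 0.1]]

dominated = all(abs(X[i][j]) <= Y[i][j] for i in range(3) for j in range(3))
rho_X = spectral_radius(X)
rho_Y = spectral_radius(Y)
```

Iterating on the squared matrix avoids the sign oscillation that plain power iteration shows when the dominant eigenvalue is negative, as is the case for `X` here.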


The following lemma could be of independent interest as a simple alternative to the commonly used Davis-Kahan theorem:

Lemma 6.2. Let $A, \delta A$ be two $n \times n$ symmetric matrices. Let $\lambda_1 \ge \ldots \ge \lambda_n$ be the eigenvalues of $A + \delta A$ and $\mu_1 \ge \ldots \ge \mu_n$ be the eigenvalues of $A$. Let $\Delta = \min\{|\mu_i - \mu_j| : \mu_i \ne \mu_j,\ \mu_i, \mu_j \text{ eigenvalue of } A\}$. Assume that $\rho(\delta A) < \frac{\Delta}{2}$. Let $v_i$ be a normed eigenvector of $A + \delta A$ corresponding to eigenvalue $\lambda_i$, for any $i = 1, \ldots, n$. Then,

(a) $|\lambda_i - \mu_i| \le \rho(\delta A)$,

(b) the dimension of the eigenspace $E_i$ of $A + \delta A$ corresponding to the eigenvalue $\lambda_i$ is no larger than the dimension of the eigenspace of $A$ corresponding to the eigenvalue $\mu_i$,

(c) there exists a normed eigenvector $\bar v_i$ of $A$ corresponding to eigenvalue $\mu_i$ such that

$$v_i \cdot \bar v_i \ge \sqrt{1 - \left(\frac{\rho(\delta A)}{\Delta/2}\right)^2}.$$


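Proof. (a) is due to Weyl's inequality (see for instance [14]).

To prove (b), let $d$ be the dimension of $E_i$ and write $\lambda_i = \lambda_{i+1} = \cdots = \lambda_{i+d-1}$. Since $|\lambda_i - \mu_i| \le \rho(\delta A)$, we have $|\mu_i - \mu_{i+1}| \le 2\rho(\delta A) < \Delta$. Thus $\mu_i = \mu_{i+1}$, and similarly for the other eigenvalues.

To prove (c), we start with some notation: Let $m$ be the number of distinct eigenvalues of $A$, denote those distinct numbers as $\gamma_1 > \cdots > \gamma_m$. Define $S_l = \{u \in \{1, \ldots, n\} : \mu_u = \gamma_l\}$, the set of indices of eigenvalues that are all equal to $\gamma_l$. For $u \in \{1, \ldots, n\}$, define $\tau_u \in \{1, \ldots, m\}$ as the unique index such that $u \in S_{\tau_u}$. Write

$$v_i = \sum_j a_j w_j,$$

where $\{w_j\}_j$ are orthonormal eigenvectors of $A$ with associated eigenvalues $\{\mu_j\}_j$. Then,

$$(A + \delta A)v_i = \sum_j a_j \mu_j w_j + (\delta A)v_i.$$

Hence, since the left-hand side equals $\lambda_i v_i$,

$$(\delta A)v_i = \sum_j a_j(\lambda_i - \mu_j) w_j.$$

Taking norms on both sides,

$$(\rho(\delta A))^2 \ge \sum_{j \notin S_{\tau_i}} a_j^2 (\Delta - \Delta/2)^2 = \Big(1 - \sum_{j \in S_{\tau_i}} a_j^2\Big)(\Delta/2)^2,$$

because, by definition, $|\mu_i - \mu_j| \ge \Delta$ for $j \notin S_{\tau_i}$, and our observation $|\lambda_i - \mu_i| \le \rho(\delta A) < \Delta/2$. Put

$$\bar v_i = \frac{1}{\sqrt{\sum_{j \in S_{\tau_i}} a_j^2}} \sum_{j \in S_{\tau_i}} a_j w_j,$$

then

$$v_i \cdot \bar v_i = \sqrt{\sum_{j \in S_{\tau_i}} a_j^2} \ge \sqrt{1 - \left(\frac{\rho(\delta A)}{\Delta/2}\right)^2}.$$

□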
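For intuition, the perturbation statements of Lemma 6.2 can be checked on a 2 x 2 example, where eigenvalues and eigenvectors are available in closed form. The sketch below is an illustration only: the matrices and the helper names are hypothetical, and the overlap bound $v_i \cdot \bar v_i \ge \sqrt{1 - (2\rho(\delta A)/\Delta)^2}$ used in it is the explicit form that the norm estimate in the proof yields, stated here as our reading rather than as a quotation.

```python
import math

def eig_sym_2x2(a, b, c):
    """Eigenpairs (descending eigenvalue) of [[a, b], [b, c]].

    Assumes the matrix is not a multiple of the identity.
    """
    mean = (a + c) / 2.0
    rad = math.hypot((a - c) / 2.0, b)
    pairs = []
    for lam in (mean + rad, mean - rad):
        x, y = b, lam - a              # solves (A - lam I) v = 0
        if math.hypot(x, y) < 1e-12:   # degenerate when b == 0 and lam == a
            x, y = lam - c, b
        norm = math.hypot(x, y)
        pairs.append((lam, (x / norm, y / norm)))
    return pairs

def spectral_radius_2x2(a, b, c):
    """Largest absolute eigenvalue of [[a, b], [b, c]]."""
    return max(abs(lam) for lam, _ in eig_sym_2x2(a, b, c))

# Hypothetical matrices, stored as (a, b, c) for [[a, b], [b, c]].
A = (2.0, 0.0, 0.0)
dA = (0.0, 0.3, 0.1)
perturbed = tuple(p + q for p, q in zip(A, dA))

base = eig_sym_2x2(*A)
pert = eig_sym_2x2(*perturbed)
rho = spectral_radius_2x2(*dA)
delta = base[0][0] - base[1][0]        # gap between the two eigenvalues of A

# (a) Weyl: each eigenvalue moves by at most rho(dA).
weyl_ok = all(abs(p[0] - b[0]) <= rho for p, b in zip(pert, base))

# (c) eigenvector overlap (up to sign) against the bound from the proof.
bound = math.sqrt(1.0 - (2.0 * rho / delta) ** 2)
overlaps = [abs(p[1][0] * b[1][0] + p[1][1] * b[1][1]) for p, b in zip(pert, base)]
overlap_ok = all(o >= bound for o in overlaps)
```

The absolute value of the inner product is taken because eigenvectors are only determined up to sign.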


Lemma 6.3. Consider a square $n \times n$ symmetric zero-diagonal random matrix $A$ such that its elements $A_{uv} = A_{vu}$ are independent Bernoulli random variables with parameters

$$a_{uv}\,\frac{\omega(n)}{n},$$

where the $a_{uv}$ are constants independent of $n$ and $\omega(n) = \Omega(\log(n))$. Then, with probability larger than $1 - O\left(\frac{1}{n^2}\right)$, the spectral radius of $A - \mathbb{E}[A]$ satisfies

$$\rho(A - \mathbb{E}[A]) \le O\left(\sqrt{\omega(n)}\right).$$

Proof. This is precisely Lemma 2 in [29], where we quantified the term "with high probability". We did this by choosing $c_1 > 3$ in its proof. Note that the latter proof builds further on results by Feige and Ofek [10]. □

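Lemma 6.4 (Bernstein's inequality). Let $X_1, \ldots, X_n$ be zero-mean independent random variables all bounded from above by one. Put $\sigma^2 = \frac{1}{n}\sum_{u=1}^n \mathrm{var}(X_u)$. Then,

$$\mathbb{P}\left(\frac{1}{n}\sum_{u=1}^n X_u > \epsilon\right) \le \exp\left(-\frac{n\epsilon^2}{2(\sigma^2 + \epsilon/3)}\right).$$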


Proof. See [3]. □ 

Note that Bernstein’s lemma can easily be extended to the case of non-centred 
random variables. 
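To get a feeling for how Bernstein's inequality is used in the proofs, one can compare the bound with an exact tail probability for centred Bernoulli variables. The following is a minimal sketch; the function names and the parameter values are assumptions chosen for illustration, and the bound is the standard form of Bernstein's inequality for an average of independent zero-mean variables bounded above by one.

```python
import math

def bernstein_bound(n, sigma2, eps):
    """Bernstein bound on P((1/n) * sum X_u > eps) for average variance sigma2."""
    return math.exp(-n * eps ** 2 / (2.0 * (sigma2 + eps / 3.0)))

def centred_bernoulli_tail(n, p, eps):
    """Exact P((1/n) * sum (B_u - p) > eps) for i.i.d. B_u ~ Bernoulli(p)."""
    threshold = n * (p + eps)
    return sum(
        math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(n + 1) if k > threshold
    )

# Hypothetical parameters: 200 centred Bernoulli(0.1) variables, deviation 0.05.
n, p, eps = 200, 0.1, 0.05
bound = bernstein_bound(n, p * (1 - p), eps)
exact = centred_bernoulli_tail(n, p, eps)
```

The centred variables $B_u - p$ take values in $\{-p, 1 - p\}$, so they are bounded above by one and have variance $p(1-p)$, which is what is passed as `sigma2`.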

7 Proofs 

In the proofs below, we shall often write

$$D_u = \phi_u\,\omega(n), \tag{7.1}$$

where $1 = \phi_1 \le \phi_2 \le \cdots \le \phi_n$, and

$$\omega(n) = D_1. \tag{7.2}$$

Further, we introduce

$$g(n) = \sum_{i=1}^n \phi_i, \tag{7.3}$$

$$\bar\phi(n) = \frac{g(n)}{n}. \tag{7.4}$$

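Proof of Lemma 5.1. Write

$$P_{uv} = \frac{1}{n\bar D}\,\frac{B_{\sigma_u\sigma_v}}{M_{\sigma_u} M_{\sigma_v}} = \frac{1}{n\bar D}\,\frac{Z_{\sigma_u\sigma_v}}{\alpha_{\sigma_v}}.$$

Let $y = (y(1), \ldots, y(K))^T$ be an eigenvector of $Z$ with eigenvalue $\lambda$; we show that $w = (y(\sigma_1), \ldots, y(\sigma_n))^T$ is an eigenvector of $P$ with eigenvalue $\lambda/\bar D$. Indeed,

$$Pw = \frac{1}{n\bar D}\begin{pmatrix} \sum_{v=1}^n \frac{Z_{\sigma_1\sigma_v}}{\alpha_{\sigma_v}}\, y(\sigma_v) \\ \vdots \\ \sum_{v=1}^n \frac{Z_{\sigma_n\sigma_v}}{\alpha_{\sigma_v}}\, y(\sigma_v) \end{pmatrix} = \frac{1}{n\bar D}\begin{pmatrix} \sum_{k=1}^K n\alpha_k \frac{Z_{\sigma_1 k}}{\alpha_k}\, y(k) \\ \vdots \\ \sum_{k=1}^K n\alpha_k \frac{Z_{\sigma_n k}}{\alpha_k}\, y(k) \end{pmatrix} = \frac{1}{\bar D}\begin{pmatrix} \lambda y(\sigma_1) \\ \vdots \\ \lambda y(\sigma_n) \end{pmatrix} = \frac{\lambda}{\bar D}\, w.$$

Thus $\lambda/\bar D$ is an eigenvalue of $P$.

For the other direction, note that if $\sigma_u = \sigma_v$, then row $u$ and row $v$ in $P$ are identical. Hence, if $w = (w(1), \ldots, w(n))^T$ is an eigenvector of $P$ corresponding to a non-zero eigenvalue, then $w(u) = w(v)$. Let $w = (w(\sigma_1), \ldots, w(\sigma_n))^T$ be an eigenvector of $P$ with eigenvalue $\lambda \ne 0$. By carrying out a similar calculation as above, we see that $(w(1), \ldots, w(K))^T$ is an eigenvector of $Z$ with eigenvalue $\bar D\lambda$.

The statement follows from this one-to-one correspondence between the eigenvectors of both matrices corresponding to non-zero eigenvalues. □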
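The lifting argument in this proof rests on an identity that holds for any vector $y$, not only for eigenvectors: applying $P$ to the block-constant lift of $y$ gives the lift of $Zy$ divided by $\bar D$. The sketch below checks this in exact arithmetic. All numerical values are hypothetical; in particular the vector `M` is filled with arbitrary positive numbers, since the identity does not depend on how the $M_i$ are defined, and `alpha` is taken to be the fraction of vertices in each block.

```python
from fractions import Fraction as F

def matvec(mat, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in mat]

# Hypothetical two-block example (blocks are indexed 0 and 1 here).
B = [[F(3), F(1)], [F(1), F(2)]]
M = [F(2), F(5)]                 # arbitrary positive values, for illustration only
sigma = [0, 0, 0, 1, 1]          # block membership of the n = 5 vertices
D_bar = F(7)                     # arbitrary positive average weight

n, K = len(sigma), len(B)
alpha = [F(sigma.count(k), n) for k in range(K)]

# Z_ij = alpha_j * B_ij / (M_i M_j) and P_uv = B_{s_u s_v} / (n * D_bar * M_{s_u} M_{s_v}).
Z = [[alpha[j] * B[i][j] / (M[i] * M[j]) for j in range(K)] for i in range(K)]
P = [[B[sigma[u]][sigma[v]] / (M[sigma[u]] * M[sigma[v]]) / (n * D_bar)
      for v in range(n)] for u in range(n)]

y = [F(1), F(-2)]                # any vector indexed by blocks
w = [y[s] for s in sigma]        # its block-constant lift to the vertices

lhs = matvec(P, w)
Zy = matvec(Z, y)
rhs = [Zy[s] / D_bar for s in sigma]
```

When `y` happens to be an eigenvector of `Z` with eigenvalue `lam`, the identity `lhs == rhs` specialises to the statement that `w` is an eigenvector of `P` with eigenvalue `lam / D_bar`.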


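Proof of Lemma 5.2. Note that

$$\mathbb{E}[\bar H] - P = \mathbb{E}[\bar H] - \big(P - \mathrm{diag}(P_{11}, \ldots, P_{nn})\big) - \mathrm{diag}(P_{11}, \ldots, P_{nn}).$$

Now,

$$\rho(\mathrm{diag}(P_{11}, \ldots, P_{nn})) = O\left(\frac{1}{n\bar D}\right),$$

as $\mathrm{diag}(P_{11}, \ldots, P_{nn})$ contains only $K$ different elements, each of order $\frac{1}{n\bar D}$. Further, for $u \ne v$,

$$\mathbb{E}[\bar H_{uv}] = \frac{1}{\mathbb{E}\widehat D_u\,\mathbb{E}\widehat D_v}\,\frac{D_u D_v}{n\bar D}\,B_{\sigma_u\sigma_v} = (1 + \delta(n))\,\frac{B_{\sigma_u\sigma_v}}{M_{\sigma_u} M_{\sigma_v}}\,\frac{1}{n\bar D} = P_{uv} + \delta(n)P_{uv},$$

where $\delta(n) = O(\epsilon_n)$, with $\epsilon_n$, due to (2.9), tending to zero uniformly for all $u, v$. Hence, due to Lemma 6.1 and the fact that $\rho(P) = O(1/\bar D)$,

$$\rho(\mathbb{E}[\bar H] - P) \le \rho\big(\mathbb{E}[\bar H] - (P - \mathrm{diag}(P_{11}, \ldots, P_{nn}))\big) + \rho(\mathrm{diag}(P_{11}, \ldots, P_{nn})) = O\left(\frac{\epsilon_n}{\bar D}\right) + O\left(\frac{1}{n\bar D}\right).$$

□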


Proof of Lemma 5.3. We start by introducing the constants Cij = — ^ 
maxij . Put for u < v, 

Xuv = Xvu = a~^uj^{n) {Huv — E [Huv ]), 
where uj(n) is defined in (7.2). That is. 


and a = 


Xuv - 


l + o(l) ( 1 1 


a \MrTuMau 4’u(l>v 


Ber ( 

nD 


-Ca 


j{n) 1 


(j) n 


with (j>u and (f defined in (7.1), respectively, (7.4). Due to our choice of a and the 
assumption that > 1 for all u, 


where 


Xuv e (1 + o(l)) \-Puv, 1 - Puv] .. 

Puv = -=-• 


Let Xuv = 1 + 0 ( 1 ) such that Xuv £ [—Pw, 1 — Puv]. We shall compare the symmetric 
zero-diagonal matrix X to the deviation from its expectation of another symmetric 
zero-diagonal matrix, where elements uv are given by Ber(p„„), for u ^ v. Since by 
assumption (2.8), 


uj{n) Di{n) 

= TiTT = 

0(n) D[n) 


(7.5) 


Lemma 6.3 applies. Following an argument given in [29], we construct a function $F_{uv}$ such that $Y_{uv} = F_{uv}(X_{uv}, U_{uv})$ has values only in $\{-p_{uv}, 1 - p_{uv}\}$ and $\mathbb{E}[Y_{uv} \mid X_{uv}] = X_{uv}$. First, let $\{U_{uv}\}_{u<v}$ be independent random variables, uniformly distributed on $[0, 1]$. Fix $u < v$. Define, for $x \in [-p_{uv}, 1 - p_{uv}]$ and $w \in [0, 1]$,

$$F_{uv}(x, w) = 1_{\{w \le x + p_{uv}\}} - p_{uv},$$

and

$$Y_{uv} = Y_{vu} = F_{uv}(X_{uv}, U_{uv}).$$

Then,

$$\mathbb{E}[F_{uv}(x, U_{uv})] = \mathbb{P}(U_{uv} \le x + p_{uv}) - p_{uv} = x,$$

thus,

$$\mathbb{E}[Y_{uv} \mid X_{uv}] = X_{uv},$$

and,

$$\mathbb{P}(Y_{uv} = 1 - p_{uv}) = \mathbb{E}[X_{uv}] + p_{uv} = p_{uv}.$$
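The coupling step can be illustrated numerically. The sketch below assumes the threshold form $F(x, w) = 1\{w \le x + p\} - p$, which is the natural way to obtain a two-valued variable with conditional mean $x$; the function names and the numerical values are hypothetical. It approximates the expectation over the uniform variable by a midpoint rule and confirms that the result is $x$ for several values of $x$ in $[-p, 1 - p]$.

```python
def coupled_value(x, w, p):
    """F(x, w) = 1{w <= x + p} - p, which takes values in {-p, 1 - p}."""
    return (1.0 if w <= x + p else 0.0) - p

def conditional_mean(x, p, grid=100000):
    """Approximate E[F(x, U)] for U uniform on [0, 1] by a midpoint rule."""
    return sum(coupled_value(x, (i + 0.5) / grid, p) for i in range(grid)) / grid

p = 0.3
xs = (-0.3, -0.1, 0.0, 0.25, 0.7)      # sample points in [-p, 1 - p]
means = {x: conditional_mean(x, p) for x in xs}

# The coupled variable only ever takes the two values -p and 1 - p.
values = {coupled_value(x, w, p) for x in xs for w in (0.0, 0.5, 1.0)}
```

Because the conditional mean of the two-valued variable reproduces $x$, averaging over the uniform variable recovers the original matrix entry, which is what makes the Jensen-type comparison of spectral radii possible.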


Hence, indeed, $Y_{uv}$ is distributed as $\mathrm{Ber}(p_{uv}) - p_{uv}$.

Let $Y$ be the symmetric zero-diagonal matrix with each element $uv$ given by $Y_{uv}$, for $u \ne v$. Then, according to Lemma 6.3,

$$\mathbb{P}\left(\rho(Y) \le O\left(\sqrt{\frac{\omega(n)}{\bar\phi}}\right)\right) \ge 1 - O(1/n^2). \tag{7.6}$$

We shall use this observation in the following comparison,

$$\rho(X) = \rho(\mathbb{E}[Y \mid X]) \le \mathbb{E}[\rho(Y) \mid X],$$

by Jensen's inequality. Put $S = \mathbb{E}[\rho(Y) \mid X]$; we shall show that it is also upper bounded by $O\left(\sqrt{\omega(n)/\bar\phi}\right)$.

Firstly, note that $|Y|$ is element-wise dominated by the all-one matrix, hence $\rho(Y) \le n$. Secondly, the sigma-algebra generated by $S$ is contained in the sigma-algebra generated by $X$. Hence,

$$\mathbb{E}[\rho(Y) \mid S] = \mathbb{E}\big[\mathbb{E}[\rho(Y) \mid X] \mid S\big] = \mathbb{E}[S \mid S] = S.$$

Further, both $Y$ and $X$ take only finitely many different values, and thus $\rho(Y)$ and $S$ take values in a finite space. It therefore makes sense to consider, for $t > 0$, the function

$$\beta(\cdot) = \mathbb{P}(\rho(Y) > t \mid S = \cdot).$$

We have,

$$S = \mathbb{E}[\rho(Y) \mid S] \le \beta(S)\,n + (1 - \beta(S))\,t,$$

i.e.,

$$\beta(S) \ge \frac{S - t}{n - t}.$$

Denote $\gamma = \mathbb{P}(S > t + 1)$, then

$$\mathbb{P}(\rho(Y) > t) = \mathbb{E}[\beta(S)] \ge \mathbb{E}[\beta(S) 1_{\{S > t+1\}}] \ge \frac{\gamma}{n - t}.$$

As a consequence, for $t = O\left(\sqrt{\omega(n)/\bar\phi}\right)$, by (7.6) one has

$$\mathbb{P}(S > t + 1) = \gamma \le (n - t)\,\mathbb{P}(\rho(Y) > t) = (n - t)\,O(1/n^2) = O(1/n).$$


Therefore, 


<(1 + 0 ( 1 )) 


uj^{n) 


P(X) 


< o 


1 


(jxo^{n) 

where the first inequality stems from the fact that the order 1 + tn term in (2.9) holds 
uniformly over all vertices. Finally, due to (7.5), 


p{H-E.[H]) = O 


/ ^ 

1 

1 ‘^{n) 



= O 


V^og{n)) D 


1
□


Proof of Lemma 5.4. To prove this lemma we show that in the present setting, with high probability,

$$(\widehat H - \bar H)_{uv} = \epsilon_{uv}\,\bar H_{uv},$$

where, for some constant $C$ and all large enough $n$,

$$|\epsilon_{uv}| \le C\epsilon(n),$$

with 

e(n) 6_21og(n) ;^^ 

Y mini Mi (-o{n) 


log(n) 


Di(n) 

by assumption. Consequently, after an appeal to Lemma 6.1,

$$\rho(\widehat H - \bar H) \le \rho(|\widehat H - \bar H|) \le C\epsilon(n)\,\rho(\bar H).$$


(7.7) 

(7.8) 

(7.9) 


Since $\bar H = \mathbb{E}[\bar H] + \bar H - \mathbb{E}[\bar H]$, it follows from Lemmas 5.2 and 5.3 that

$$\rho(\bar H) = O\left(\frac{1}{\bar D}\right),$$

which completes the proof.

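Consider the difference

$$\frac{1}{\widehat D_u \widehat D_v} - \frac{1}{\mathbb{E}\widehat D_u\,\mathbb{E}\widehat D_v} = \frac{1}{\mathbb{E}\widehat D_u\,\mathbb{E}\widehat D_v}\left(\frac{\mathbb{E}\widehat D_u}{\widehat D_u}\,\frac{\mathbb{E}\widehat D_v}{\widehat D_v} - 1\right) = \frac{\epsilon_{uv}}{\mathbb{E}\widehat D_u\,\mathbb{E}\widehat D_v},$$

thus

$$\epsilon_{uv} = \frac{\mathbb{E}\widehat D_u - \widehat D_u}{\mathbb{E}\widehat D_u} + \frac{\mathbb{E}\widehat D_v - \widehat D_v}{\mathbb{E}\widehat D_v} + O\left(\left(\frac{\mathbb{E}\widehat D_u - \widehat D_u}{\mathbb{E}\widehat D_u}\right)^2\right) + O\left(\left(\frac{\mathbb{E}\widehat D_v - \widehat D_v}{\mathbb{E}\widehat D_v}\right)^2\right).$$

We quantify $\frac{|\mathbb{E}\widehat D_u - \widehat D_u|}{\mathbb{E}\widehat D_u}$. Since $\widehat D_u$ is a sum of Bernoulli random variables with mean

$$\mathbb{E}\widehat D_u = D_u M_{\sigma_u}(1 - o(1)),$$

where the $o(1)$ term follows from (2.9), we have, for $\epsilon(n)$ as in (7.8), by Bernstein's inequality (see Lemma 6.4),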


EDu - Du 


EDu 


> e(»i') < 2exp — 


e\n) 


2 + e(n)/3 


E 


N 


< 2exp 

<4 

n-^ 


Invoking this we establish the union bound 


E77i - 77i 


ETli 


< £{n), ..., 


EDu - Du 


EDu 


< (-{n) ] >1-^1 


>1-2 

n 


EDu - Du 


EDu 


(7.10)

as $n \to \infty$. Hence,

$$\mathcal{E} = \left\{\frac{|\mathbb{E}\widehat D_u - \widehat D_u|}{\mathbb{E}\widehat D_u} \le \epsilon(n) \text{ for all } u \in V\right\}$$

holds with high probability. Thus we establish (7.7): $|\epsilon_{uv}| \le 2\epsilon(n) + O(\epsilon^2(n)) \le C\epsilon(n)$, with $C$ a large enough constant.

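We henceforth condition on $\mathcal{E}$. Then, for $u \ne v$,

$$\widehat H_{uv} - \bar H_{uv} = \left(\frac{1}{\widehat D_u \widehat D_v} - \frac{1}{\mathbb{E}\widehat D_u\,\mathbb{E}\widehat D_v}\right) A_{uv} = \epsilon_{uv}\,\frac{1}{\mathbb{E}\widehat D_u\,\mathbb{E}\widehat D_v}\,A_{uv} = \epsilon_{uv}\,\bar H_{uv}.$$

□


Proof of Lemma 5.5. We define

$$\epsilon(n) = \frac{1}{\log^{1/3}(n)}, \tag{7.11}$$

and we shall call a vertex $u$ good if $|\mathbb{E}\widehat D_u - \widehat D_u| \le \epsilon(n)\,\mathbb{E}\widehat D_u$, and bad otherwise. We use this definition to split

$$\left(\frac{1}{\widehat D_u \widehat D_v} - \frac{1}{\mathbb{E}\widehat D_u\,\mathbb{E}\widehat D_v}\right) A_{uv} = M_{uv} + M'_{uv} + M''_{uv} - M'''_{uv}, \tag{7.12}$$

where

$$M_{uv} = \left(\frac{1}{\widehat D_u \widehat D_v} - \frac{1}{\mathbb{E}\widehat D_u\,\mathbb{E}\widehat D_v}\right) A_{uv}\,1_{\{u \text{ and } v \text{ good}\}},$$

$$M'_{uv} = \left(\frac{1}{\widehat D_u \widehat D_v} - \frac{1}{\mathbb{E}\widehat D_u\,\mathbb{E}\widehat D_v}\right) A_{uv}\,1_{\{v \text{ bad}\}},$$

$$M''_{uv} = \left(\frac{1}{\widehat D_u \widehat D_v} - \frac{1}{\mathbb{E}\widehat D_u\,\mathbb{E}\widehat D_v}\right) A_{uv}\,1_{\{u \text{ bad}\}},$$

$$M'''_{uv} = \left(\frac{1}{\widehat D_u \widehat D_v} - \frac{1}{\mathbb{E}\widehat D_u\,\mathbb{E}\widehat D_v}\right) A_{uv}\,1_{\{u \text{ and } v \text{ bad}\}}.$$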

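We shall show that all terms in (7.12) have a negligible spectral radius compared to $\Delta(P)$. First note that the difference

$$\frac{1}{\widehat D_u \widehat D_v} - \frac{1}{\mathbb{E}\widehat D_u\,\mathbb{E}\widehat D_v}$$

may be written as

$$\frac{1}{\mathbb{E}\widehat D_u\,\mathbb{E}\widehat D_v}\,\epsilon_{uv},$$

where

$$\epsilon_{uv} = \frac{\mathbb{E}\widehat D_u - \widehat D_u}{\mathbb{E}\widehat D_u} + \frac{\mathbb{E}\widehat D_v - \widehat D_v}{\mathbb{E}\widehat D_v} + O\left(\left(\frac{\mathbb{E}\widehat D_u - \widehat D_u}{\mathbb{E}\widehat D_u}\right)^2\right) + O\left(\left(\frac{\mathbb{E}\widehat D_v - \widehat D_v}{\mathbb{E}\widehat D_v}\right)^2\right).$$

Now, similarly as in the proof of Lemma 5.4, there exists a constant $C$, such that $|\epsilon_{uv}| \le C\epsilon(n)$ if both $u$ and $v$ are good. Consequently, $\rho(M) \le C\epsilon(n)\rho(\bar H)$.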

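Next we analyse the other terms in (7.12). We start with $M''$. The idea is that, although now

$$\left|\frac{1}{\widehat D_u \widehat D_v} - \frac{1}{\mathbb{E}\widehat D_u\,\mathbb{E}\widehat D_v}\right| = O\left(\frac{1}{\mathbb{E}\widehat D_u\,\mathbb{E}\widehat D_v}\right),$$

the total number of non-zero elements in a column of $M''$ is very small, so that its spectral radius indeed vanishes upon division by $\Delta(P)$. We note that

$$\big(A_{uv} 1_{\{v \text{ bad}\}}\big)_{u,v} = \Big(\big(A_{uv} 1_{\{u \text{ bad}\}}\big)_{u,v}\Big)^T,$$

so that a similar statement holds for the maximal row sum of $M'$. Obviously, $|M'''| \le |M''|$ entry-wise, and the same ordering holds for their spectral radii.

As a consequence of these observations, it thus suffices to prove our claim for $M''$. To do so, we proceed in three steps: First we show that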


P(fi) 




> 1 - 2/n^ 


(7.13) 


From which it follows after a short computation that, with probability larger than 
1 — 2/n^, for all u,v, 


1 1 
Du Du 


1 1 

EDu EDu 


< 3- 


EDu ED 


V 


Keeping this in mind, it thus suffices to demonstrate that bad})at; has a spectral 

radius much smaller than the spectral radius of A. The column sum in the former equals 
the number of bad neighbours a vertex has. That is, the spectral radius is bounded by 
max„ Xu, where for u G V, 

Xu= Y. 

v£j\f{u) 

with, 

Zu ll{u is bad} • 

Caution is needed here as the indicator functions in (7.14) are not independent. 

In the second step we shall show that with high probability the number of edges 
between vertices in the neighbourhood of u is negligible compared to the expected 
degree of vertex u. That is. 


^mu))=] 


Y f ) > 1 - 2A 

yeJ\f{u) J J 


(7.15) 


where d;(n) is defined in (7.2). Hence, except for possibly le(n)aj(n) of them, the 
variables in (7.14) form an independent set (conditional on not having any neighbours 
among M{u)). 

The last step consists in showing that this leads to 

¥ (^Xu > e{n)0 (eDu)\£i,£ 2 ) = o(l/n). (7.16) 

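The assertion follows now straightforwardly: with high probability, we have

$$\rho(M'') \le 3\max_u \frac{1}{\mathbb{E}\widehat D_u}\,\frac{1}{\min_v \mathbb{E}\widehat D_v}\,X_u \le 3\max_u \frac{1}{\mathbb{E}\widehat D_u}\,\frac{1}{\min_v \mathbb{E}\widehat D_v}\,\epsilon(n)\,O(\mathbb{E}\widehat D_u) = O\left(\frac{\epsilon(n)}{\min_u \mathbb{E}\widehat D_u}\right) = O\left(\frac{\epsilon(n)}{\omega(n)}\right).$$

Now $\bar D = O(\omega(n))$, since $\frac{D_1^2(n)}{\bar D(n)} = \Omega(\log(n))$. Consequently, due to the choice of $\epsilon(n)$ in (7.11),

$$\rho(M'') = O\left(\frac{\epsilon(n)}{\omega(n)}\right) = O\left(\frac{1}{\log^{1/3}(n)}\,\frac{1}{\bar D(n)}\right) = \frac{o_n(1)}{\bar D(n)}.$$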


The first step, i.e. demonstrating equation (7.13), is easily carried out: Fix u G V 
and use Bernstein’s inequality (6.4) to verify the bound 


EDu - Du 


EDu 


> 1/2 I < 2exp ( 


[fi.]). 


Now, for n large enough, EDu > MCb,M logn, and by assumption, Cs^m from (2.7) is 
so large that > 21og(n). Hence, 

P (^l/2Ei5„ <Du< 3/2EDu^ > 1 - 2/n^ 

We proceed with the second step, i.e., (7.15). Put M = maxi Mi, B = maxij- Bij. 
Set C = max{l/2M, 5M^, H}. Consider, conditional on Du < 2ED„, 

9{n) 


^ 51/ 

x,y^N'{u) x,y^J\f{u) 


< Bin 


g{n) 


< Bin ( 5M(n), B 

V 9{n) 

^ r?" f ^2/ \ 

< Bin 5M d)„ijj (n), B - , , 

V ^ g{n) 

^ T3' 2/ N ^/>n<^(’1-) 

< Bin ( {n),C 


9(t 

where 4’u and gin) are defined in (7.1), respectively (7.3). We now show that 

TD- 2/ N ^'/nw(n) 

Bin Crpni^ in),C 


9{r~ 


> -e(n)a;(n) ) = o(l/n). 


First, note that 


Bin CS^Lu M,C —> -e(n)uj(n) < i , , ^ n C — 

V ^ gin) ) - 2^ ’ ^ V “ \H^)^(^)J V 9in) ) 


Using that (^) < i^)^, we have 

'cKuj\n) 

le(n)a;(n) 


(7.17) 


C^Win)\ ^ 

® e(n) 


= exp -e(n)tu(n)log 2Ce 


^(pluijn) 

e(n) 


< exp ( |e(n)a;(n)log(p(n)) + ie(n)a;(n)log (2Ce) ) ,
where c < I from (2.7) is such that - > 0 (and thus

2 ' ' log2/2{n)n‘: ' 9‘^(n) 

since g{n) — 0(n) in the particular setting of this lemma). Write 


°r.(l) 

log^/^(n) 


< e(w), 



< exp ^-ie(n)a;(n)log ((s(n))^"^)^ 
= exp f-^!-^e(n)aj(n)log(sr(n)) j , 


if n large enough. Combining these estimates, we see that (7.17) may be bounded from 
above by 

exp ^e(n)a;(n)log (g(n)) + i£(n)(:u(n)log (2Ce) 

< ^-^'-^e(n)a;(n)log(5(n))^ , 

since g{n) > 2Ce. Finally, since ^~^‘^ e(n)a;(n) > 2, for large n, 


P(£:2 ) = 1-p( ^ > ie(n)cu(n) I > > l-l/n^ (7.18) 

\x,yeN'{u) j 

that is (7.15). 

We proceed with the last step, i.e., establishing (7.16). Write, 




uSA/'{u):A/'{u)nA/'{u)#0 d6A/'(u) :A/'(i;)nA/'(u)=0 


We already know from (7.15) that the first sum is smaller than ^e{n)uj{n), with high 
probability. The variables in the second sum, {Z.u}vej\r(u):j\r(v)nj\r(u)= 0 , are independent. 
For such a vertex v G Af{u) that has no neighbour with u in common, we have 
D„ = d(, + 1, where 

d'^= V Ber 

, \ nD 

the degree of v outside N{u) U {u}. We show that u is a good vertex with high 
probability, by proving that d(, concentrates on its mean which on its turn is close to 
E . Firstly, define 



then 


E*[-] := E ^ \M{u), 82 ,Du < 2E j , 
E*K]= V 

X^J\ {U) ,X^U 


> 



DyDx 

nD

Secondly, we use Bernstein's inequality (Lemma 6.4) to prove that $d'_v$ concentrates around $\mathbb{E}[\widehat D_v]$ up to a factor $\epsilon(n)$ as in (7.11):


’ (^d'^ > (1 + e(n))ED„ |a/’(m), £' 2 , Du < 2E ^ 
{e{n)Etd'u + (1 + e{n))o„{l))^ 


- V 2 (E.< + 1/3 (e(n)E*d(, + (1 + e(n))o„(l))) 

/ (.(n)E*d;)^(l+,(^)\ 

^ [ -^ J 

< exp (-C'e^(n)log(n)) , 
where we redefined C = |. Similarly, 

P (^d'u < (1 — e(n))ED„ |a/’(m), £ 2 , Tin < 2E j < exp (—Ce^(n)log(n)) . 

Hence each vertex v € Af{u) that has no neighbour with u in common is thus a 
good vertex with probability 2exp (—C'£^(n)log(n)). Consequently, conditional on 

M{u),£2, Du < 2E , 

< Bin ^2EDti, 2exp (—Ce^(n)logn)^ . 


E 

v£j\f {u)\j\f {v)r\J\f {u)=0 


We have. 


P ^Bin ^2EDu,2exp (—Ce^(n)logn)^ > ^e(n)EIl, 

/ 2EDu \ (r, I ri'^( \^ \\5£(»i)Ei5u 

- I ie(n)E5j (2-P(-C^^ Wl«g-)) 


\^(n)mDu 


- (2exp(-C'e"(n)logn)) 

= exp l^^e(n)EDu - Ce^(n)lognJJ 


t£(n)Ei3„ 


= o(l/n), 

since e(n) = l/log^/^(n). Hence, 


Xu > +EDu^ 


£i ,£2 ) = o(l/n). 


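The last step ((7.16)) is completed by noting that $\omega(n) = O(\mathbb{E}\widehat D_u)$. □

Proof of Lemma 5.6. All matrices in

$$W = (\widehat H - \bar H) + (\bar H - \mathbb{E}[\bar H]) + (\mathbb{E}[\bar H] - P),$$

are real and symmetric, hence, combining Lemmas 5.1 - 5.5,

$$\rho(W) \le \rho(\widehat H - \bar H) + \rho(\bar H - \mathbb{E}[\bar H]) + \rho(\mathbb{E}[\bar H] - P) = \frac{o_n(1)}{\bar D(n)}.$$

Employing Lemma 6.2 gives that to each eigenvector $\widehat x$ of $\widehat H = P + W$ corresponds an eigenvector $\bar x$ of $P$ such that

$$\widehat x \cdot \bar x = 1 - O\left(\frac{\rho^2(W)}{\Delta^2(P)}\right) = 1 - o_n(1),$$

since $\Delta(P) = \Omega(1/\bar D(n))$. □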

Proof of Lemma 5.7. Invoking Lemma 5.6, to each $\widehat x_i$ (with eigenvalue $\widehat\lambda_i$) there exists a normed eigenvector $\bar x_i$ (with eigenvalue $\lambda_i$) of $P$ such that

$$\widehat x_i \cdot \bar x_i = 1 - f_i(n), \tag{7.19}$$

with $f_i(n) = o_n(1)$. We claim that all $\lambda_i$ are larger than zero (note that we refer here to a set of $\widehat L$ eigenvalues). This can be seen as follows: From Lemma 5.1 we know that the first $L$ eigenvalues of $P$ are of order $1/\bar D$ and all other eigenvalues are zero. By Lemma 6.2, $|\widehat\lambda_i - \lambda_i| \le \rho(W) \ll 1/\bar D$, hence the first $L$ eigenvalues of $\widehat H$ are also of order $\Omega(1/\bar D) - O(\rho(W)) = \Omega(1/\bar D)$, and the other $n - L$ are of order $O(\rho(W))$. Now, the $\widehat L$ eigenvalues of $\widehat H$ that are picked in Step 1 of Algorithm 1 are precisely those whose absolute value exceeds $f(n)/\widehat D_{\mathrm{average}} = \Theta(f(n)/\bar D) \gg \rho(W)$, by construction of $f$ in Section 3. Hence those eigenvalues must necessarily be of order $\Omega(1/\bar D)$ (i.e., they are indeed non-zero) and $\widehat L = L$ with high probability.

Since $\bar x_i$ corresponds to a non-zero eigenvalue, it follows from the proof of Lemma 5.1 that $\bar x_i$ is constant on each block, i.e., $\bar x_i(u) = \bar x_i(v)$ if $\sigma_u = \sigma_v$. Let $\bar x_i^{(k)}$ be the value of $\bar x_i$ on block $k \in S$. Put

$$t_k = \sqrt{n}\,\big(\bar x_1^{(k)}, \ldots, \bar x_L^{(k)}\big). \tag{7.20}$$


Then, 


1 " 

1/n |{m e V : ||yn 2 „ - > T^}\ < ^ 

m=l 

n 

= l/r^^||(3:'“\...,S:W)-(4""\---,4"“’)ir 

U = 1 

L 

= \/T^ ^ ||xa; - Xk\\^ 
fc=i 
L 

fc=l 

to finish the proof, let T = fk{n)^ ^ = O ^ ^ = o„(l). □ 

Proof of Lemma 5.8. Below we shall make a spectral decomposition in terms of $L$ orthonormal eigenvectors of $Z$ that span the union of all eigenspaces corresponding to non-zero eigenvalues. Recall from the proof of Lemma 5.1 how we can obtain the eigenvectors of $Z$ from the eigenvectors of $P$.

Recall that by construction $\{\widehat x_i\}_{i=1}^L$ are orthonormal eigenvectors of $\widehat H$ corresponding to non-zero eigenvalues spanning an $L$ dimensional space. Recall further from the proof of Lemma 5.7 that the corresponding eigenvectors $\{\bar x_i\}_{i=1}^L$ of $P$ are associated with non-zero eigenvalues. Lemma 6.2 (b) entails that the space spanned by those $\{\bar x_i\}_{i=1}^L$ has also dimension $L$. And Lemma 5.6 implies that $\{\bar x_i\}_{i=1}^L$ become an orthonormal set for $n$ tending to infinity (because they become more and more aligned with the orthonormal set $\{\widehat x_i\}_{i=1}^L$).

Let, as in the proof of Lemma 5.7, $\bar x_i^{(k)}$ be the value of $\bar x_i$ on block $k \in S$. Note that $\sum_k n\alpha_k (\bar x_i^{(k)})^2 = 1$ for $i \in \{1, \ldots, L\}$. Putting $y_i = \sqrt{n}\,(\bar x_i^{(1)}, \ldots, \bar x_i^{(K)})^T$, we see that each $y_i$ is a normalized eigenvector of $Z$ in the sense that $\sum_k \alpha_k (y_i(k))^2 = 1$.

Now, assume for a contradiction that $\|t_k - t_l\| \to 0$ as $n \to \infty$:

$$\sum_{i=1}^L \big|\sqrt{n}\,\bar x_i^{(l)} - \sqrt{n}\,\bar x_i^{(k)}\big|^2 = \sum_{i=1}^L |y_i(l) - y_i(k)|^2 \to 0. \tag{7.21}$$


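We conclude that there exist orthonormal eigenvectors of $Z$, $\{y_1, \ldots, y_L\}$ (with eigenvalues $\{\lambda_i\}_{i=1}^L$ after a possible relabelling of indices), that span the range of $Z$, such that

$$y_u(k) = y_u(l)$$

for all $u$. The other $K - L$ eigenvectors have zero as an eigenvalue.

To proceed, consider matrix

$$N = \left(\sqrt{\alpha_u}\,\frac{B_{uv}}{M_u M_v}\,\sqrt{\alpha_v}\right)_{u,v}.$$

If $(x(1), \ldots, x(K))^T$ is an eigenvector of $Z$ then $(\sqrt{\alpha_1}\,x(1), \ldots, \sqrt{\alpha_K}\,x(K))^T$ is an eigenvector of $N$, as is easily verified. Hence $N$ has $\{(\sqrt{\alpha_1}\,y_i(1), \ldots, \sqrt{\alpha_K}\,y_i(K))^T\}_{i=1}^L$ as eigenvectors corresponding to non-zero eigenvalues and $K - L$ eigenvectors with 0 as eigenvalue (which do not contribute to the spectral decomposition of $N$). Hence

$$\sqrt{\alpha_u}\,\frac{B_{uv}}{M_u M_v}\,\sqrt{\alpha_v} = \sum_i \sqrt{\alpha_u}\,y_i(u)\,\lambda_i\,\sqrt{\alpha_v}\,y_i(v).$$

Thus, for all $u$,

$$\frac{B_{ku}}{M_k M_u} = \sum_m y_m(k)\,\lambda_m\,y_m(u) = \sum_m y_m(l)\,\lambda_m\,y_m(u) = \frac{B_{lu}}{M_l M_u},$$

violating assumption (5.5).

□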


7.1 Comparison to spectral analysis on the adjacency matrix


Proof of Theorem 4.1. This proof leans strongly on ideas borrowed from [22], where graphs without a community-structure are considered. Parts of their proof carry through for the DC-SBM considered here. Note that $\lim_{n \to \infty} g(n)/n = 1$.

By definition, we require without loss of generality $D_1 \le D_2 \le \cdots \le D_n$. However, we obtain the same graph (with now a decreasing degree-sequence) by a rearrangement of indices, if we put


- 


U 

1 


if u < 1 < A: = 
if u > n^, 


(7.22)
where <j)\ = and = (jiu'jji'n) (with uj as in (7.2)).


r 1 if M < t 

\ 2 if M > f. 


(7.23) 


Denote a sample of the random graph by $G$. We decompose $G$ into the following graphs (exactly as in [22]):

• $G_1$, which is a union of vertex disjoint stars $S_1, \ldots, S_k$. Star $S_u$ has as its center node $u$ and as leaves those vertices from among $\{k+1, \ldots, n\}$ adjacent to $u$, but not adjacent to $\{1, \ldots, u-1\}$;

• $G'_1$ is the graph consisting of all edges of $G$ with one endpoint in $\{1, \ldots, k\}$ and the other endpoint in $\{k+1, \ldots, n\}$, except for those edges in $G_1$;

• $G_2$ is the subgraph of $G$, which is induced by $\{1, \ldots, k\}$;

• $G_3$ is the subgraph of $G$, which is induced by $\{k+1, \ldots, n\}$.

Further, let $F_u$ be the subset of vertices in $\{k+1, \ldots, n\}$ that are adjacent to $\{1, \ldots, u-1\}$ and let $C$ be a constant, independent of $n$, whose value might change along the course of the proof.
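The decomposition above is purely combinatorial and can be sketched in a few lines. The function name `decompose` and the small example graph are assumptions made for illustration; vertices are labelled $1, \ldots, n$ and the first $k$ labels play the role of the star centres.

```python
def decompose(edges, n, k):
    """Split the edges of a graph on vertices 1..n into (G1, G1', G2, G3).

    G1  : stars centred at 1..k; a vertex b > k is a leaf of star a if it is
          adjacent to a but to no vertex in {1, ..., a-1}.
    G1' : remaining edges between {1..k} and {k+1..n}.
    G2  : edges inside {1..k}.
    G3  : edges inside {k+1..n}.
    """
    adj = {u: set() for u in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    g1, g1_prime, g2, g3 = set(), set(), set(), set()
    for u, v in edges:
        a, b = min(u, v), max(u, v)
        if b <= k:
            g2.add((a, b))
        elif a > k:
            g3.add((a, b))
        elif any(w < a for w in adj[b]):
            # b already has a neighbour among 1..a-1, so (a, b) is not a star edge.
            g1_prime.add((a, b))
        else:
            g1.add((a, b))
    return g1, g1_prime, g2, g3

# Hypothetical example with n = 6 vertices and k = 2 centres.
edges = [(1, 2), (1, 3), (1, 4), (2, 4), (2, 5), (3, 4), (5, 6)]
g1, g1_prime, g2, g3 = decompose(edges, n=6, k=2)
```

In this example vertex 4 is adjacent to both centres, so it becomes a leaf of the star centred at 1 and the edge (2, 4) is moved to $G'_1$; this is what keeps the stars vertex disjoint.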

We claim that $\widehat d_u$, the degree of vertex $u$ in $G_1$, concentrates around its mean. Indeed, consider


du = ^ Ber 

l = k + l 


DuDi 


-B. 


^g{n)uj{n) 
where g is defined in (7.3). Then, 

oj{n)(j)u 


- Y. B®'' 

l&Fu 


DuDi 

g{n)uj{n) 


Ba 


du = E 


[d„j 


> 


gij: 


Y -GEIIFuj 

Z=fc+1 


which we bound from below by estimating E [[Tfuj], for u < k = n^: For large enough n. 


^{\Fu\] 


^ DiDu 


< 


G 


uj{n)cf>i 

9{n) 


n u — \ 


Y Y 

l=k+lv=l 


1 

V 


< Guj{n)(j)i ^ ^ 

gin) 

< Gijj{n)rF'^^^, 


after recalling the special choice for the degree sequence. 
Consequently, we have 


B 11 +B 12 B 11 +B 12 f n 

^ 2 ^ ^ 2 ^ gin) ) 


n 


Invoking large deviation theory on du (which is a sum of Bernoulli random variables), 

we deduce that _ 

P ^|d„ — du\ > Vc'dulognj < 2/n'' , (7.24) 

for $c' > 0$ a constant. We take $c' = 8$ to establish (7.24) uniformly over all vertices. We next investigate $\Delta(G_1)$, the smallest gap between different eigenvalues of $G_1$. This graph is the union of vertex disjoint stars with degrees $\widehat d_1, \ldots, \widehat d_k$, so that its spectrum is given by

$$\left\{\pm\sqrt{\widehat d_1 - 1}, \ldots, \pm\sqrt{\widehat d_k - 1}\right\}.$$

We claim that 

, - -y-3/3 

A(Gi) > C^JuJ{n)n ^ oo (7.25) 

with high probability. Indeed, define 

Xu = du± \/ c'd„logn, 


and note that with high probability and du+i < To investigate the 

difference we first bound du — du+i from below: 


du d' 


u + l ^ 


> 

> 


Bii + Bi2 


uj{n)(j>i 


n/g{n) 


- C 


uj{n)n 


7+2/3 

u 


u{u + 1) 

2 ^ u \ u+l ' 


^11 
Bii + Bi2 


) 




T-lgjn) 
u I + 1 


- C 


u{ny 


,7+2/3 


Bii + Bi2 , , , 1 1 

--- uj(n)(pi^—x 

4 nP nP 

Bii + Bi 2 uj{n)rP^^ 


B\\ + Bi 


-uj{n)rB' 


Next we show that the y/dulogn terms are negligible: 


due to (4.3). Hence, 
As a consequence. 


sj dulogn < y + ^ —n/g{n)Du\ogn 

< CLo{n)\og{n)nP+P 

< Cuj{n)n 2 

< u>{n)rB~^, 

xZ - 2/^+1 > CLo{n)rB~^. 

A(Gi) > min { Jdu - 1 - \Jdu+i - 1] 

iiS{l,...,fc} V / 


= mm 


tpu ^u+1 


> c 


Vxu - 1 + 

io{n)rB~^ 


/——r 7+/3 
^ijj(n)n 2 


j - 7 — 3/3 

= Cy/uj{n)n 2 ^ 


that is (7.25).

We continue with an inspection of $G'_1$, that is, we focus on $\widehat m_u = \widehat D_u|_{G'_1}$, the degree of vertex $u$ in $G'_1$, and show that

fhu < 2 c^logn, 


(7.26) 


with high probability (here, $m_u$ is the expectation of $\widehat m_u$). We shall use this in combination with the fact that the spectral radius of a graph is bounded by its largest degree.

Write 


= I] Ber 


9{n) 


-B 




This expression allows us to deduce an upperbound for m„, 

= E [fhu] 

(puUjin) 


< CE [Fu 


9{^) 


^ ( \ 7 + 2/3 7+/3 1 

< Cuj(n)n n - 

u g{n) 


< CiJ^{n) -, 

n 

which tends to zero due to (4.2). Standard bounds for Bernoulli random variables give 

(c'logn)^ 


P {[m-u — fhu] < c'logn) < 2 exp 


< 2 exp I — ^c'logn 


2(m„ + c'log(n)/3) 
1 
4 


fic'/i 

We conclude that, with probability at least 1 — 

$$\widehat m_u \le m_u + c'\log n \le 2c'\log n,$$

i.e., (7.26) holds. An identical estimate holds when $u > k$.

We next bound the number of edges in $G_2$, denoted by $E(G_2)$. The square root of $|E(G_2)|$ is an upper bound for the spectral radius of $G_2$.


E[|E(G2)|1 = G^^ 

< G' 


(j^u4^V^{gf) 


—' ^ q(n) 

Lo{n)n^ 

gin) 

27 + 4.3 


<C-, 

n 

vanishing for large n. Again, upon invoking standard large deviation theory, we have, 

2 


with probability at least 1 — 

E[|^^(G 2 )|] < 2 c'logn. 

Consider the degree of a vertex u > fc in G 3 , 


(7.27) 


E 




gir^ 

<C^n 

gin) 

< Gu}{n). 


Ba
Hence,


P {Du\g 3 > Cu){n) + Vc'log(n)C'a;(n)^ < (7.28) 

Combining these observations leads to our assertion that the first k eigenvectors of 
A become undistinguishable of those of the k stars, when n tends to infinity. Indeed, 
split A according to the described graph-composition: 

A = A\gi + A\qi^ + A\g2 + 71|g3, 

and note that the spectral radii of A\qi^, A\a 2 and A\g 2 vanish in the presence of A(Gi). 
This follows because (as mentioned above) for any graph its spectral radius is bounded 
by the minimum of its largest degree and the square root of its number of edges. Hence, 
due to (7.26) - (7.28), 

p{A\g[) < 2c'logn, 
p{A\g 2 ) < V2c'logn, 

and 

P{A\g 3 ) < Cuj{n) + \/c'log(n)Ca;(n), 

with high probability. All three bounds indeed vanish upon division by
$\lambda(G_1) > C\omega(n)\, n^{\frac{\gamma - 3\beta}{2}}$. Lemma 6.2 finishes the proof. □
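The two elementary spectral bounds used in this proof can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the proof: the helper names are hypothetical, the random graph is a plain Erdős-Rényi graph rather than the graphs $G_1', G_2, G_3$ of the decomposition, and the edge-count bound is used in the form $\sqrt{2|E|}$, which holds for every simple graph because $\rho(A)^2 \le \mathrm{tr}(A^2) = 2|E|$.

```python
import numpy as np

def spectral_radius(adj):
    """Largest absolute eigenvalue of a symmetric adjacency matrix."""
    return float(np.max(np.abs(np.linalg.eigvalsh(adj))))

def degree_and_edge_bounds(adj):
    """Return (largest degree, sqrt(2 * number of edges))."""
    degrees = adj.sum(axis=1)
    num_edges = adj.sum() / 2.0
    return float(degrees.max()), float(np.sqrt(2.0 * num_edges))

def star(num_leaves):
    """Adjacency matrix of a star: vertex 0 is joined to all other vertices."""
    adj = np.zeros((num_leaves + 1, num_leaves + 1))
    adj[0, 1:] = 1.0
    adj[1:, 0] = 1.0
    return adj

def sparse_random_graph(n, p, seed=0):
    """Symmetric 0/1 adjacency matrix with independent edges, no self-loops."""
    rng = np.random.default_rng(seed)
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)

star_adj = star(100)                        # one high-degree hub
noise_adj = sparse_random_graph(101, 0.02)  # sparse background graph

rho_star = spectral_radius(star_adj)
rho_noise = spectral_radius(noise_adj)
max_deg, sqrt_edges = degree_and_edge_bounds(noise_adj)

print(f"star: rho = {rho_star:.3f}")
print(f"noise: rho = {rho_noise:.3f}, max degree = {max_deg:.0f}, "
      f"sqrt(2|E|) = {sqrt_edges:.3f}")
```

A star with $d$ leaves has spectral radius $\sqrt{d}$, which is why the hubs dominate the spectrum once their degrees grow much faster than the bounds on the remaining parts.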


7.2 Interpretation of the conditions 


Proof of Remark 4.2. Assume,


Bjj 

Mi 


then. 


Now, put 


Bjj 

MiMj 

then 


Bij 

Ml 

MiMj ■ 


(7.29) 


Bjj _ anfiBijajfij 
MiMj ^ af^iMiMjUj'^^ ' 

We give a probabilistic interpretation to the terms appearing in the denominator: 

K 

nai(f>^Mi = nai(j>i y ] y ] 4'nBia^ 

fc=l u:cTu = k 
K 

= nak<l)kBik 

k=l 


{nai){nak)(t)iBik(l)k 

k=l 

K 

^ ^ ^ ^ 4^u ^ ^ 4^vBik 

k = l u:cTu=i v:(T.u = k 

E E E 


(7.30) 


uj{n) 


ijj{n) 


uj(n) 


k = l u:cr^=i v:(7u=k 
n 

u:<7u=i m=l 

{expected total degree of vertices in community i}- 
An inspection of the numerator reveals


Tl(y.i(l)^Bij (l)j(y,jTl — ^ ^ (j^u ^ ^ ^vBij 

u:cTu='i' v:cTy=j 

= ^ y y ¥{u^v) 

uj(n) ^ ^ 

u:(Tu = t v:cTv=j 

71 

= ^ ^ {expected i^edges between community i and j} 
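The probabilistic reading of these sums can be checked on a small example. The sketch below assumes a generic degree-corrected block model in which the edge probability is `scale * phi_u * phi_v * B[sigma_u, sigma_v]`; the scalar `scale` stands in for the normalisation of the model and, like all numerical values and names here, is a hypothetical choice made for illustration. For simplicity the sums include the pairs $u = v$.

```python
import numpy as np

def edge_probabilities(phi, sigma, B, scale):
    """p_uv = scale * phi_u * phi_v * B[sigma_u, sigma_v] (assumed model form)."""
    return scale * np.outer(phi, phi) * B[np.ix_(sigma, sigma)]

rng = np.random.default_rng(1)
n, K = 60, 3
sigma = rng.integers(0, K, size=n)   # community labels
sigma[:K] = np.arange(K)             # make sure every community is non-empty
phi = rng.uniform(0.5, 1.5, size=n)  # degree-correction weights
B = np.array([[3.0, 1.0, 0.5],
              [1.0, 2.0, 1.0],
              [0.5, 1.0, 4.0]])
scale = 0.01
P = edge_probabilities(phi, sigma, B, scale)

i, j = 0, 1
in_i, in_j = sigma == i, sigma == j

# Expected number of edges between communities i and j, summed pair by pair ...
expected_edges_ij = P[np.ix_(in_i, in_j)].sum()
# ... equals the factorised form: (sum of phi in i) * (sum of phi in j) * B_ij.
factorised_ij = scale * phi[in_i].sum() * phi[in_j].sum() * B[i, j]

# Expected total degree of community i, directly and block by block.
expected_degree_i = P[in_i, :].sum()
blockwise_degree_i = scale * sum(
    phi[in_i].sum() * phi[sigma == k].sum() * B[i, k] for k in range(K)
)

print(expected_edges_ij, factorised_ij)
print(expected_degree_i, blockwise_degree_i)
```

The factorisation works because, within a pair of communities, the block entry is constant and the weights enter multiplicatively.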


Proof of Lemma 4.4. Assume first that for some $i$ and $l$ we have, for all $j$,

Bij = Bij 


(7.31) 


□ 


and, for all u, v, 


iBa 


g{n) 


iBa 


?(«-) 


with (j)u defined in (7.1) and g in (7.3) {(fu and fj are defined analogously). Fix j. Let 
$\alpha$, $\beta$ and $\gamma$ be any indices such that $\sigma_\alpha = i$, $\sigma_\beta = j$ and $\sigma_\gamma = l$. Then,


pa Bjj (j>p _ pa Bjj pp rj _ Pj^ ff(^) p 

g{n) g{n) pa pg g{n) 


and 


(p'y _ 07 Blj<l>^ ^ R . _ 07 0/3 ff ('^) o 

g{n) g(n) p^ pgg{n) 


implying that (since Bij = Bij) 


Bij — 


pa Pp g{n) ^ 
pa Pp g(n) 



Since $j$ was arbitrary, there exists a constant $c$ such that, for all $j$,

$B_{ij} = c B_{lj},$

thereby violating the identifiability condition, as pointed out in Remark 4.3, i.e.,

$\frac{B_{ij}}{M_i} = \frac{B_{lj}}{M_l},$

for all $j$.

Now assume that (a) holds, that is,

$\frac{B_{ij}}{M_i} = \frac{B_{lj}}{M_l},$

for all $j$. Define, for $k, l \in S$ and $u \in V$,

Biel - Bh] 

Mk Ml 


and 


where 


pu = f{n)puMa^, 


f{n) 


J2v pyMg^ 
Then,


-_1 1_1 I Mi _ 1 


It ivij ivii iVlj 

and (as above, we define g analogously to g), 


l>U B< 7 u 4 ^'^ 

d{n) 


4^uB(y (j (pu 


E 

.B, 


U-t^CFuCTy 


9{ri) 


□ 


8 Future research 

8.1 Exact recovery 

The clustering obtained here is almost exact: only a vanishing fraction of nodes is
misclassified. It is plausible that an exact clustering could be obtained from this
clustering by using it as input to the "clean-up" algorithm presented in Section 7.2 of
[1] or, alternatively, Algorithm 2 in [23].


8.2 Non-constant B 


In this paper we assumed $B$ to be a constant matrix. The current analysis
could be extended to a setting where $B$ is allowed to change with $n$. For $H$ to
concentrate, however, we need a constant $\delta > 0$ such that $\rho(Z) > \delta$ for all $n$.
For identifiability we need the existence of some $\epsilon > 0$ such that for all $i, j$ and $n$,


max^/ 


7^ 


> e. 


MaM-, — 


8.3 Sparser graphs 

The main issue with both the normalized adjacency matrix and the Laplacian is proving
when those matrices concentrate around a deterministic matrix. For the Laplacian,
if the degrees are of order $\Omega(\log(n))$, the matrices concentrate according to [6]. But if
the minimum degree is of order $o(\log(n))$, the graph is seen to have some isolated
vertices. Those contribute multiple zeros to the spectrum; hence the matrix does not
concentrate. There are multiple ways to overcome this issue, for instance removing the
low-degree vertices or raising all the degrees. The latter strategy is proposed in [18] for
the inhomogeneous Erdős-Rényi random graph (where edges are independently present
with probabilities $(p_{uv})_{u,v=1}^{n}$) and also in [27, 4] (see Section 4.4) for the DC-SBM.
According to [18], for $\tau \sim d$, with $d = n \max_{u,v} p_{uv}$, with high probability,

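$\rho\left(L_\tau - \mathbb{E}[D_\tau]^{-1/2}\, \mathbb{E}[A]\, \mathbb{E}[D_\tau]^{-1/2}\right) = O\left(\frac{1}{\sqrt{d}}\right),$

where $L_\tau$ is defined in (4.9).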

Based on these observations, it might be fruitful to use $H$ on a graph where the
degrees have been artificially inflated.
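To make the idea of inflating degrees concrete, here is a minimal sketch. It assumes the regularisation takes the form $(D + \tau I)^{-1/2} A (D + \tau I)^{-1/2}$, with $\tau$ set to the average degree; this form, the function names and the numerical values are assumptions made for illustration, and the sketch does not reproduce the definition in (4.9) or the matrix $H$.

```python
import numpy as np

def normalized_adjacency(adj, tau=0.0):
    """Return (D + tau*I)^(-1/2) A (D + tau*I)^(-1/2) for a symmetric matrix A.

    tau = 0 gives the usual degree normalisation, which is undefined
    as soon as the graph has an isolated vertex.
    """
    inflated = adj.sum(axis=1) + tau
    if np.any(inflated <= 0):
        raise ValueError("isolated vertex: choose tau > 0")
    scaling = 1.0 / np.sqrt(inflated)
    return scaling[:, None] * adj * scaling[None, :]

def sparse_random_graph(n, p, seed=0):
    """Symmetric 0/1 adjacency matrix with independent edges, no self-loops."""
    rng = np.random.default_rng(seed)
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)

adj = sparse_random_graph(200, 0.005)  # average degree around 1
adj[0, :] = 0.0                        # force at least one isolated vertex
adj[:, 0] = 0.0

tau = adj.sum() / adj.shape[0]         # average degree, used as inflation level
L_tau = normalized_adjacency(adj, tau)
top = float(np.max(np.abs(np.linalg.eigvalsh(L_tau))))
print(f"tau = {tau:.3f}, spectral radius of regularised matrix = {top:.3f}")
```

With $\tau > 0$ every inflated degree is positive, so the normalisation is well defined even for isolated vertices, and the spectral radius of the regularised matrix stays below one.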